TY - GEN
T1 - Video Modification in Drone and Satellite Imagery
AU - Reale, Michael J.
AU - Murphy, Daniel
AU - Cornacchia, Maria
AU - Madera, Jamie Vazquez
N1 - Publisher Copyright: © 2024 SPIE.
PY - 2024
Y1 - 2024
AB - The ability to create and detect synthetic video is becoming critically important to scene understanding. Techniques for synthetic manipulation and augmentation of data increase diversity within available datasets without requiring laborious labeling efforts. That is, the ability to create synthetic video can enable augmentation of small realistic datasets on which to further train Artificial Intelligence and Machine Learning (AI/ML) algorithms. Thus, it may be desirable to add, remove, or modify vehicles in satellite and overhead video. In our previous work, we leveraged generative adversarial networks (GANs) to transform cars into trucks (and vice versa) in static images. We utilized an attention-based masking approach that helps the network transform the object rather than the background. In addition, we demonstrated the benefits of numerous data augmentation procedures, presented a new artificial dataset of vehicles from an aerial perspective, and introduced novel augmentation techniques appropriate for our network architectures. This work extends those techniques from still imagery to video. We employ three architectures: (1) a fully dynamic 3D convolutional discriminator network with static generators, (2) a fully dynamic 3D convolutional discriminator and generator network, and (3) an architecture that computes "warp" between frames for input to a static generator. Additionally, to help enforce consistency across frames, we experiment with an interframe classifier that predicts whether two frames belong to the same video sequence. We run experiments on a real-world dataset, presenting promising results in terms of Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and metrics derived from a classifier trained on our dataset.
KW - GAN
KW - deep learning
KW - image translation
KW - inpainting
KW - video transformation
UR - https://www.scopus.com/pages/publications/85196489030
U2 - 10.1117/12.3013881
DO - 10.1117/12.3013881
M3 - Conference contribution
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Disruptive Technologies in Information Sciences VIII
A2 - Blowers, Misty
A2 - Wysocki, Bryant T.
PB - SPIE
T2 - Disruptive Technologies in Information Sciences VIII 2024
Y2 - 22 April 2024 through 25 April 2024
ER -