Converting Optical Videos to Infrared Videos Using Attention GAN and Its Impact on Target Detection and Classification Performance

Deep-learning-based algorithms for object detection and classification in infrared (IR) videos need large amounts of training data to yield high-performance models. In many surveillance applications, however, far more optical videos are available than infrared videos. This shortage of IR video datasets can be mitigated if optical videos can be converted to infrared videos. In this paper, we present a new deep-learning approach for converting optical videos to infrared videos. The basic idea is to use an attention generative adversarial network (attention GAN) that focuses on target areas and thereby preserves their fidelity. The approach does not require paired images. We demonstrate the performance of the proposed attention GAN using objective and subjective evaluations. Most importantly, we demonstrate its impact through improved target detection and classification performance on real infrared videos.
