Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks

Infrared (IR) images are essential for improving the visibility of dark or camouflaged objects. Neural-network-based object recognition and segmentation using IR images provides more accuracy and insight than using color visible images. The bottleneck, however, is the limited number of relevant IR images available for training. Real-world IR images are difficult to collect for special purposes such as space exploration, military, and fire-fighting applications. To address this problem, we created color visible and IR images using a Unity-based 3D game editor. These synthetically generated images were used to train cycle-consistent adversarial networks (CycleGAN) to convert visible images to IR images. CycleGAN has the advantage that it does not require precisely matched visible and IR image pairs for training. In this study, we found that additional synthetic data can help improve CycleGAN performance. Training with real data only (N = 20) produced more accurate transformations than training with a combination of real (N = 10) and synthetic (N = 10) data. This result indicates that the synthetic data cannot exceed the quality of the real data. Training with a combination of real (N = 10) and synthetic (N = 100) data showed almost the same performance as training with real data only (N = 20). At least 10 times more synthetic data than real data was required to achieve the same performance. In summary, using CycleGAN with synthetic data improves the performance of visible-to-IR image conversion.
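The property that CycleGAN does not need matched visible and IR pairs comes from its cycle-consistency term: an image translated to the other domain and back should reproduce itself. The following is a minimal NumPy sketch of that term only, not the training code used in this study. The function names and the two toy "generators" (a channel average and a channel repeat) are hypothetical stand-ins for trained networks, and the adversarial losses are omitted.

```python
import numpy as np


def cycle_consistency_loss(visible, infrared, vis_to_ir, ir_to_vis):
    """L1 cycle loss: mean|F(G(x)) - x| + mean|G(F(y)) - y|.

    visible and infrared are independent batches; they need not be paired
    or even contain the same number of images.
    """
    reconstructed_visible = ir_to_vis(vis_to_ir(visible))
    reconstructed_infrared = vis_to_ir(ir_to_vis(infrared))
    forward = np.mean(np.abs(reconstructed_visible - visible))
    backward = np.mean(np.abs(reconstructed_infrared - infrared))
    return forward + backward


# Hypothetical toy generators standing in for trained networks (assumption).
def toy_vis_to_ir(batch):
    # Collapse RGB to a single channel by averaging.
    return batch.mean(axis=-1, keepdims=True)


def toy_ir_to_vis(batch):
    # Expand a single channel back to three identical channels.
    return np.repeat(batch, 3, axis=-1)


rng = np.random.default_rng(0)
visible_batch = rng.random((4, 8, 8, 3))   # 4 unpaired visible images
infrared_batch = rng.random((2, 8, 8, 1))  # 2 unpaired IR images

loss = cycle_consistency_loss(
    visible_batch, infrared_batch, toy_vis_to_ir, toy_ir_to_vis
)
print(f"cycle-consistency loss: {loss:.4f}")
```

Because each batch is only compared with its own reconstruction, the visible and IR sets can be collected separately, which is what makes mixing real and synthetic images from different scenes feasible.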
