RFN-Nest: An end-to-end residual fusion network for infrared and visible images

Abstract In the image fusion field, the design of deep learning-based fusion methods is far from routine: it is invariably task specific and requires careful consideration. The most difficult part of the design is choosing an appropriate strategy to generate the fused image for the task at hand. Devising a learnable fusion strategy is therefore a very challenging problem in the image fusion community. To address this problem, we develop a novel end-to-end fusion network architecture (RFN-Nest) for infrared and visible image fusion. We propose a residual fusion network (RFN), based on a residual architecture, to replace the traditional fusion approach. A novel detail-preserving loss function and a feature-enhancing loss function are proposed to train the RFN. The fusion model is learned with a novel two-stage training strategy: in the first stage, we train an auto-encoder based on an innovative nest connection (Nest) concept; in the second stage, the RFN is trained using the proposed loss functions. Experimental results on public domain data sets show that our end-to-end fusion network outperforms the state-of-the-art methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/hli1221/imagefusion-rfn-nest.
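To make the abstract's two-stage idea concrete, the PyTorch sketch below shows a residual fusion block with learnable fusion, a reconstruction-only first stage, and a second stage that freezes the auto-encoder and trains only the fusion block. This is a minimal single-scale sketch under loud assumptions: the tiny auto-encoder stands in for the multi-scale nest-connection network, and the two losses are simplified proxies for the paper's detail-preserving and feature-enhancing losses; all layer sizes, loss weights, and the max-based feature target are illustrative, not the authors' configuration (see the linked repository for the real implementation).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RFNBlock(nn.Module):
    """Fuse infrared and visible feature maps with a residual shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        # Shortcut path: merge the two feature maps with a 1x1 convolution.
        self.shortcut = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # Residual path: a small conv stack that learns the fusion details.
        self.residual = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        x = torch.cat([feat_ir, feat_vis], dim=1)
        return self.shortcut(x) + self.residual(x)

class TinyAutoEncoder(nn.Module):
    """Single-scale stand-in for the nest-connection auto-encoder."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.decoder = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Stage 1: train the auto-encoder for reconstruction (pixel loss only here;
# the paper additionally uses a structural-similarity term).
ae = TinyAutoEncoder()
opt1 = torch.optim.Adam(ae.parameters(), lr=1e-4)
img = torch.rand(2, 1, 64, 64)  # dummy grayscale batch
loss1 = F.mse_loss(ae(img), img)
opt1.zero_grad()
loss1.backward()
opt1.step()

# Stage 2: freeze the encoder/decoder and train only the RFN block.
for p in ae.parameters():
    p.requires_grad_(False)
rfn = RFNBlock(channels=64)
opt2 = torch.optim.Adam(rfn.parameters(), lr=1e-4)
ir, vis = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
f_ir, f_vis = ae.encoder(ir), ae.encoder(vis)
fused_feat = rfn(f_ir, f_vis)
fused_img = ae.decoder(fused_feat)
# Simplified proxies for the paper's two losses: keep visible-image detail,
# and pull fused features toward the elementwise-stronger source features.
loss_detail = F.l1_loss(fused_img, vis)
loss_feature = F.mse_loss(fused_feat, torch.max(f_ir, f_vis))
loss2 = loss_detail + 0.5 * loss_feature
opt2.zero_grad()
loss2.backward()
opt2.step()
```

The key design point the sketch preserves is that the fusion strategy itself is a trained module: stage 2 optimizes only the RFN parameters while the frozen encoder and decoder fix the feature space in which fusion happens.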
