VIF-Net: An Unsupervised Framework for Infrared and Visible Image Fusion

Visible images provide abundant texture details and environmental information, while infrared images offer night-time visibility and suppress highly dynamic regions; fusing these two types of features from different sensors into a single informative image is therefore a meaningful task. In this article, we propose an unsupervised end-to-end learning framework for infrared and visible image fusion. We first construct a sufficiently large benchmark training dataset from visible and infrared frames, which addresses the shortage of training data. Additionally, because labeled datasets are lacking, our architecture is trained with a robust mixed loss function that combines the modified structural similarity (M-SSIM) metric and total variation (TV); this unsupervised learning process adaptively fuses thermal radiation and texture details while suppressing noise interference. In addition, our method is an end-to-end model, which avoids hand-crafted fusion rules and reduces computational cost. Furthermore, extensive experimental results demonstrate that the proposed architecture performs better than state-of-the-art methods in both subjective and objective evaluations.
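
To make the idea of a mixed structural-similarity and total-variation loss concrete, here is a minimal NumPy sketch. The abstract does not give the exact form of M-SSIM or of the TV term, so everything specific here is an assumption for illustration: plain single-window SSIM stands in for M-SSIM, the SSIM term is averaged over both source images, TV is applied to the residual between the fused and visible images, and the names `global_ssim`, `total_variation`, `mixed_loss`, and the `weight` parameter are hypothetical.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Plain SSIM over the whole image as one window (a stand-in for M-SSIM)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return numerator / denominator


def total_variation(img):
    """Anisotropic total variation, normalised by the number of pixels."""
    vertical = np.abs(np.diff(img, axis=0)).sum()
    horizontal = np.abs(np.diff(img, axis=1)).sum()
    return (vertical + horizontal) / img.size


def mixed_loss(fused, infrared, visible, weight=0.5):
    """Assumed combination: weight * (1 - mean SSIM) + TV(fused - visible)."""
    ssim_term = 1.0 - 0.5 * (global_ssim(fused, infrared) + global_ssim(fused, visible))
    tv_term = total_variation(fused - visible)
    return weight * ssim_term + tv_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visible = rng.random((32, 32))
    infrared = rng.random((32, 32))
    fused = 0.5 * (visible + infrared)  # naive average as a placeholder fusion
    print("loss for naive average fusion:", mixed_loss(fused, infrared, visible))
```

In this sketch the SSIM term rewards structural agreement with both sources, while the TV term penalises high-frequency differences from the visible image, which is one plausible way to suppress noise. A trainable version would express the same terms in a deep learning framework so that gradients can flow to the network.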
