A light-weight, efficient, and general cross-modal image fusion network

Abstract: Existing cross-modal image fusion methods pay little attention to fusion efficiency and network architecture, even though both the efficiency and the accuracy of image fusion have an important impact on practical applications. To address this, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. First, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, InceptionNet, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, which provides a reference for designing image fusion architectures. Second, we explore the commonalities and distinguishing characteristics of different image fusion tasks, which provides a basis for further research on the continuous learning characteristics of the human brain. Finally, a positive sample loss is added to the similarity loss to reduce the differences in data distribution across cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks, at a real-time speed of 100+ FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 improves efficiency by a factor of 2.14. Compared with the image fusion model with the smallest model size, our model is smaller by a factor of 11.59.
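To make the architectural comparison concrete, the following is a minimal sketch of how group and depth-wise convolutions reduce the weight count of a single layer relative to a standard convolution. It is not the AE-Netv2 implementation: the helper names `conv_params` and `depthwise_separable_params`, and the layer sizes (64 input channels, 64 output channels, 3x3 kernel), are assumptions chosen only for illustration.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Weight count of a k x k 2D convolution (hypothetical helper).

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by that factor.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # mixes channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the paper).
c_in, c_out, k = 64, 64, 3

standard = conv_params(c_in, c_out, k)
grouped = conv_params(c_in, c_out, k, groups=4)
separable = depthwise_separable_params(c_in, c_out, k)

print(f"standard:  {standard}")
print(f"grouped:   {grouped}")
print(f"separable: {separable}")
```

Fewer weights usually mean a smaller model and fewer multiply-accumulate operations, but measured speed also depends on memory access patterns and hardware support, which is why an empirical comparison of the kind described in the abstract is still needed.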
