Multigrained Attention Network for Infrared and Visible Image Fusion

Methods based on generative adversarial networks (GANs) have been widely used for infrared and visible image fusion. However, these methods cannot perceive the discriminative parts of an image. Therefore, we introduce a multigrained attention module into an encoder–decoder network to fuse infrared and visible images (MgAN-Fuse). Because of their different modalities, the infrared and visible images are encoded by two independent encoder networks. The outputs of the two encoders are then concatenated, and the decoder computes the fused result. To fully exploit the features of the multiscale layers and force the model to focus on discriminative regions, we integrate attention modules into the multiscale layers of the encoder to obtain multigrained attention maps; these maps are then concatenated with the corresponding multiscale features of the decoder network. Thus, the proposed method preserves the foreground target information of the infrared image while capturing the context information of the visible image. Furthermore, we design an additional feature loss during training to preserve the important features of the visible image, and a dual adversarial architecture helps the model capture sufficient infrared intensity information and visible details simultaneously. Ablation studies confirm the validity of the multigrained attention network and the feature loss function. Extensive experiments on two infrared and visible image data sets demonstrate that the proposed MgAN-Fuse outperforms state-of-the-art methods.
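The multigrained attention idea described above can be sketched conceptually: at each encoder scale, the infrared and visible feature maps are concatenated, an attention map is computed over the combined features, and the attention-weighted result is what would be passed to the matching decoder layer. The following NumPy sketch is illustrative only and is not the authors' implementation; the function names (`spatial_attention`, `multigrained_fuse`) and the mean-then-sigmoid attention form are hypothetical stand-ins for the paper's learned attention modules.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(feat):
    """Hypothetical spatial-attention map: average over channels, then
    squash to (0, 1). feat has shape (C, H, W); the map has shape (1, H, W)."""
    return sigmoid(feat.mean(axis=0, keepdims=True))

def multigrained_fuse(ir_feats, vis_feats):
    """Conceptual sketch of multigrained attention: for every encoder scale,
    concatenate the two modalities along the channel axis, compute an
    attention map, and reweight the concatenated features. In the actual
    network, each reweighted tensor would be concatenated with the
    corresponding decoder features."""
    fused = []
    for f_ir, f_vis in zip(ir_feats, vis_feats):
        cat = np.concatenate([f_ir, f_vis], axis=0)  # channel-wise concat
        att = spatial_attention(cat)                 # (1, H, W), broadcasts over C
        fused.append(cat * att)                      # attention-weighted features
    return fused

# Toy encoder features at two scales (coarse grain halves the spatial size).
ir = [np.random.rand(8, 32, 32), np.random.rand(16, 16, 16)]
vis = [np.random.rand(8, 32, 32), np.random.rand(16, 16, 16)]
out = multigrained_fuse(ir, vis)
print([o.shape for o in out])  # [(16, 32, 32), (32, 16, 16)]
```

Each scale yields one attention-weighted map, which is why the abstract speaks of "multigrained" attention: finer scales attend to local detail, coarser scales to larger structures.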
