DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

Multi-modality image fusion aims to combine different modalities to produce fused images that retain the complementary features of each modality, such as functional highlights and texture details. To leverage strong generative priors and address the unstable training and lack of interpretability of GAN-based generative methods, we propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM). The fusion task is formulated as a conditional generation problem under the DDPM sampling framework, which is further divided into an unconditional generation subproblem and a maximum likelihood subproblem. The latter is modeled in a hierarchical Bayesian manner with latent variables and inferred by the expectation-maximization algorithm. By integrating the inference solution into the diffusion sampling iteration, our method can generate high-quality fused images that draw on natural image generative priors and cross-modality information from the source images. Note that all we require is an unconditional pre-trained generative model, and no fine-tuning is needed. Extensive experiments indicate that our approach yields promising fusion results in infrared-visible image fusion and medical image fusion. The code will be released.
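
The split into an unconditional generation subproblem and a maximum likelihood subproblem can be read as an instance of Bayes' rule applied to the score of the conditional distribution. The identity below is a minimal sketch of that reading, not the paper's exact formulation. The symbols are assumptions introduced here for illustration: f_t denotes the noisy fused image at diffusion step t, and i and v denote the two source images (for example, infrared and visible).

```latex
% Assumed notation (not from the abstract): f_t = noisy fused image at step t,
% i, v = source images. By Bayes' rule, p(f_t | i, v) \propto p(f_t) p(i, v | f_t),
% and the normalizer p(i, v) does not depend on f_t, so its gradient vanishes.
\nabla_{f_t} \log p(f_t \mid i, v)
  = \underbrace{\nabla_{f_t} \log p(f_t)}_{\text{unconditional generation}}
  + \underbrace{\nabla_{f_t} \log p(i, v \mid f_t)}_{\text{likelihood term}}
```

Under this reading, the first term would be supplied by the unconditional pre-trained diffusion model, while the second would correspond to the likelihood subproblem that the abstract describes as being inferred by expectation-maximization.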
