Structural Similarity Loss for Learning to Fuse Multi-Focus Images

Convolutional neural networks (CNNs) have recently been used for multi-focus image fusion. However, some existing methods simulate defocus by adding Gaussian blur to focused images, thereby generating data with ground truth for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classification results to construct the fusion weight maps, which then necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach that directly predicts the fully focused output image from multi-focus input image pairs. The proposed approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The loss is computed from the structural similarity (SSIM) index, a metric that is widely accepted for fused image quality evaluation. In addition, when designing the loss function, we use the standard deviation of a local image window to automatically estimate the importance of each source image in the final fused image. The model is a feed-forward, fully convolutional neural network that accepts images of variable sizes, both during training and at test time; hence we are able to train it on real benchmark datasets instead of simulated ones. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks.
