Masked Image Training for Generalizable Deep Image Denoising

When capturing and storing images, devices inevitably introduce noise. Reducing this noise is a critical task called image denoising. Deep learning has become the de facto method for image denoising, especially with the emergence of Transformer-based models that have achieved notable state-of-the-art results on various image tasks. However, deep learning-based methods often suffer from a lack of generalization ability. For example, deep models trained on Gaussian noise may perform poorly when tested on other noise distributions. To address this issue, we present a novel approach to enhance the generalization performance of denoising networks, known as masked training. Our method involves masking random pixels of the input image and reconstructing the missing information during training. We also mask out the features in the self-attention layers to avoid the impact of training-testing inconsistency. Our approach exhibits better generalization ability than other deep learning models and is directly applicable to real-world scenarios. Additionally, our interpretability analysis demonstrates the superiority of our method.
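The input-masking step described above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `mask_random_pixels`, the 0.75 mask ratio, and filling masked pixels with zeros are all hypothetical choices made for this example, and the attention-layer masking is not covered.

```python
import numpy as np


def mask_random_pixels(image, mask_ratio=0.75, rng=None):
    """Zero out a random subset of pixel locations.

    Hypothetical helper for illustration: mask_ratio and the zero fill
    value are assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # keep[i, j] is True where the pixel survives, False where it is masked.
    keep = rng.random((h, w)) >= mask_ratio
    if image.ndim == 3:
        # Mask all channels of a pixel together.
        masked = image * keep[..., None]
    else:
        masked = image * keep
    return masked, keep


# Toy data: a random "clean" image plus Gaussian noise as the training input.
rng = np.random.default_rng(0)
clean = rng.random((8, 8, 3)).astype(np.float32)
noisy = clean + rng.normal(0.0, 0.1, clean.shape).astype(np.float32)

masked_input, keep = mask_random_pixels(noisy, mask_ratio=0.75, rng=rng)
```

In a training loop, `masked_input` would be fed to the denoising network and the output compared against `clean`, so the network has to reconstruct the hidden pixels from their surroundings rather than only fit the training noise distribution.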
