论文信息 - Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild

Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild

Diffusion models have shown promising results on single-image super-resolution and other image- to-image translation tasks. Despite this success, they have not outperformed state-of-the-art GAN models on the more challenging blind super-resolution task, where the input images are out of distribution, with unknown degradations. This paper introduces SR3+, a diffusion-based model for blind super-resolution, establishing a new state-of-the-art. To this end, we advocate self-supervised training with a combination of composite, parameterized degradations for self-supervised training, and noise-conditioing augmentation during training and testing. With these innovations, a large-scale convolutional architecture, and large-scale datasets, SR3+ greatly outperforms SR3. It outperforms Real-ESRGAN when trained on the same data, with a DRealSR FID score of 36.82 vs. 37.22, which further improves to FID of 32.37 with larger models, and further still with larger training sets.

David J. Fleet | Chitwan Saharia | Daniel Watson | Hshmat Sahak

[1] Yihao Liu,et al. Blind Image Super-Resolution: A Survey and Beyond , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] David J. Fleet,et al. Image Super-Resolution via Iterative Refinement , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Seung-Won Jung,et al. RZSR: Reference-Based Zero-Shot Super-Resolution With Depth Guided Self-Exemplars , 2022, IEEE Transactions on Multimedia.

[4] Jonathan Ho. Classifier-Free Diffusion Guidance , 2022, ArXiv.

[5] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.

[6] B. Ommer,et al. High-Resolution Image Synthesis with Latent Diffusion Models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7] A. Dimakis,et al. Deblurring via Stochastic Refinement , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8] David J. Fleet,et al. Palette: Image-to-Image Diffusion Models , 2021, SIGGRAPH.

[9] David J. Fleet,et al. Cascaded Diffusion Models for High Fidelity Image Generation , 2021, J. Mach. Learn. Res..

[10] Qi Li,et al. SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models , 2021, Neurocomputing.

[11] Changyou Chen,et al. Fine-Grained Attention and Feature-Sharing Generative Adversarial Networks for Single Image Super-Resolution , 2019, IEEE Transactions on Multimedia.

[12] Ying Shan,et al. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data , 2021, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).

[13] Tieniu Tan,et al. End-to-end Alternating Optimization for Blind Super Resolution , 2021, ArXiv.

[14] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.

[15] Wei An,et al. Unsupervised Degradation Representation Learning for Blind Super-Resolution , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Luc Van Gool,et al. Flow-based Kernel Prior with Application to Blind Super-Resolution , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Luc Van Gool,et al. Designing a Practical Degradation Model for Deep Blind Image Super-Resolution , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[18] Abhishek Kumar,et al. Score-Based Generative Modeling through Stochastic Differential Equations , 2020, ICLR.

[19] Hemant A. Patil,et al. CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion , 2020, 2020 28th European Signal Processing Conference (EUSIPCO).

[20] Changhu Wang,et al. Conditional Meta-Network for Blind Super-Resolution with Multiple Degradations , 2021, ArXiv.

[21] Wangmeng Zuo,et al. Component Divide-and-Conquer for Real-World Image Super-Resolution , 2020, ECCV.

[22] Truyen Tran,et al. Catastrophic forgetting and mode collapse in GANs , 2020, 2020 International Joint Conference on Neural Networks (IJCNN).

[23] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.

[24] C. Rudin,et al. PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Yang Song,et al. Generative Modeling by Estimating Gradients of the Data Distribution , 2019, NeurIPS.

[26] Lei Zhang,et al. Toward Real-World Single Image Super-Resolution: A New Benchmark and a New Model , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27] Yun Fu,et al. Image Super-Resolution Using Very Deep Residual Channel Attention Networks , 2018, ECCV.

[28] Kyung-Ah Sohn,et al. Fast, Accurate, and, Lightweight Super-Resolution with Cascading Residual Network , 2018, ECCV.

[29] Michal Irani,et al. "Zero-Shot" Super-Resolution Using Deep Internal Learning , 2017, CVPR.

[30] Jian Yang,et al. FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31] Eirikur Agustsson,et al. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[32] Jae-Seok Choi,et al. A Deep Convolutional Neural Network with Selection Units for Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[33] Kyoung Mu Lee,et al. Enhanced Deep Residual Networks for Single Image Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[34] Jian Yang,et al. Image Super-Resolution via Deep Recursive Residual Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[36] Narendra Ahuja,et al. Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37] Mohammad Norouzi,et al. Pixel Recursive Super Resolution , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[38] Kyoung Mu Lee,et al. Deeply-Recursive Convolutional Network for Image Super-Resolution , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Surya Ganguli,et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics , 2015, ICML.

[40] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.

[41] Eero P. Simoncelli,et al. Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.