Paper Information - RePaint: Inpainting using Denoising Diffusion Probabilistic Models

Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks. RePaint outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions. GitHub repository: git.io/RePaint
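The conditioning idea in the abstract (alter only the reverse diffusion iterations, sampling the unmasked regions from the given image) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the known pixels are obtained by noising the given image to the current noise level, and it uses a toy four-pixel "image", a toy noise schedule, and a hypothetical `placeholder_reverse_step` that stands in for one reverse step of a pretrained unconditional DDPM. All function and variable names are illustrative.

```python
import math
import random


def forward_noise(x0, alpha_bar, rng):
    """Sample q(x_t | x_0) = N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I), pixel by pixel."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def placeholder_reverse_step(x_t, t, rng):
    """Hypothetical stand-in for one reverse step of a pretrained unconditional DDPM.

    A real model would predict the noise in x_t; this toy version only
    shrinks the sample and adds a little fresh noise.
    """
    return [0.9 * v + 0.1 * rng.gauss(0.0, 1.0) for v in x_t]


def masked_reverse_step(x_t, x0_known, keep_mask, alpha_bar_prev, t, rng,
                        reverse_step=placeholder_reverse_step):
    """One mask-conditioned reverse iteration (assumed form, see lead-in).

    keep_mask[i] is True where the pixel is given and False where it must
    be generated. The network itself is never modified or conditioned.
    """
    # Known region: noise the given image to the level of step t - 1.
    known = forward_noise(x0_known, alpha_bar_prev, rng)
    # Unknown region: ordinary unconditional reverse step on the whole sample.
    unknown = reverse_step(x_t, t, rng)
    # Combine: given pixels come from the image, the rest from the model.
    return [k if m else u for k, u, m in zip(known, unknown, keep_mask)]


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0.5, -0.25, 0.0, 1.0]        # toy "image" of four pixels
    keep = [True, False, True, False]  # True = pixel is given
    T = 10
    # Toy schedule (an assumption): alpha_bars[0] == 1 means "no noise left".
    alpha_bars = [1.0 - i / (T + 1) for i in range(T + 1)]
    x = [rng.gauss(0.0, 1.0) for _ in x0]  # start from pure noise
    for t in range(T, 0, -1):
        x = masked_reverse_step(x, x0, keep, alpha_bars[t - 1], t, rng)
    print(x)
```

Because the mask is applied only at sampling time, the same pretrained model can in principle be reused for any mask shape, which is the property the abstract emphasizes. In the final iteration of this sketch the noise level is zero, so the given pixels are reproduced exactly.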
