Pluralistic Free-Form Image Completion

Image completion involves filling the missing regions of an image with plausible content. Current image completion methods produce only one result for a given masked image, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion—the task of generating multiple and diverse plausible solutions for free-form image completion. A major challenge faced by learning-based approaches is that there is usually only one ground-truth training instance per label for this multi-output problem. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that utilizes the single ground truth to obtain a prior distribution over missing patches and rebuilds the original image from this distribution. The other is a generative path, whose conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by adversarial learning. We then introduce a new short+long term patch attention layer that exploits distant relations between decoder and encoder features, to improve appearance consistency between the original visible regions and the newly generated ones. Experiments show that our method not only yields better results on various datasets than existing state-of-the-art methods, but also provides multiple and diverse outputs.
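The short+long term patch attention described above lets each position in the decoder attend over both the decoder's own features (short term) and the encoder's features from the visible regions (long term). The following is a minimal, simplified sketch of that idea in NumPy, not the paper's exact layer: the function names and the plain dot-product attention formulation are illustrative assumptions, and real implementations operate on convolutional feature maps with learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_attention(decoder_feat, encoder_feat):
    """Toy short+long term attention.

    decoder_feat: (N, C) flattened decoder positions (short-term source).
    encoder_feat: (M, C) flattened encoder positions from the visible
                  regions (long-term source).
    Each decoder position attends over the union of both feature sets,
    so generated content can borrow appearance from distant visible
    patches as well as from nearby generated ones.
    """
    keys = np.concatenate([decoder_feat, encoder_feat], axis=0)   # (N+M, C)
    scale = np.sqrt(decoder_feat.shape[1])
    scores = decoder_feat @ keys.T / scale                        # (N, N+M)
    weights = softmax(scores, axis=1)                             # rows sum to 1
    return weights @ keys                                         # (N, C)
```

In the paper's setting this kind of layer is inserted between decoder stages, so the attended output is fused back into the decoder features before the next upsampling step; the sketch above only shows the attention computation itself.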
