Free-Form Image Inpainting With Gated Convolution

We present a generative image inpainting system that completes images with free-form masks and guidance. The system is based on gated convolutions learned from millions of images without additional labelling effort. The proposed gated convolution solves the problem that vanilla convolution treats all input pixels as valid, and it generalizes partial convolution by providing a learnable dynamic feature selection mechanism for each channel at each spatial location across all layers. Moreover, because free-form masks may appear anywhere in an image and take any shape, global and local GANs designed for a single rectangular mask are not applicable. We therefore also present a patch-based GAN loss, named SN-PatchGAN, which applies a spectral-normalized discriminator to dense image patches. SN-PatchGAN is simple in formulation and fast and stable in training. Results on automatic image inpainting and its user-guided extension demonstrate that our system generates higher-quality and more flexible results than previous methods. Our system helps users quickly remove distracting objects, modify image layouts, clear watermarks, and edit faces. Code, demo, and models are available at: https://github.com/JiahuiYu/generative_inpainting.
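
The per-channel, per-location feature selection described above can be illustrated with a small NumPy sketch. It is not the released implementation: it assumes the gate is a second convolution over the same input passed through a sigmoid, and that the feature branch uses an ELU activation. The helper names (conv2d_same, gated_conv2d), the tensor layout, and the 4-channel input (masked RGB plus the mask) are choices made only for this illustration.

```python
import numpy as np


def conv2d_same(x, w, b):
    """Naive stride-1, zero-padded convolution (odd kernel size assumed).

    x: (H, W, C_in), w: (k, k, C_in, C_out), b: (C_out,)
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    height, width, _ = x.shape
    out = np.empty((height, width, w.shape[-1]))
    for i in range(height):
        for j in range(width):
            patch = xp[i:i + k, j:j + k, :]
            # Contract the (k, k, C_in) patch against the kernel.
            out[i, j] = np.tensordot(patch, w, axes=3) + b
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elu(z):
    # ELU is an assumed choice of activation for the feature branch.
    return np.where(z > 0, z, np.expm1(np.minimum(z, 0.0)))


def gated_conv2d(x, w_feat, b_feat, w_gate, b_gate):
    """Return (output, gate) for one gated convolution layer.

    The gate holds one soft value in [0, 1] per output channel and per
    spatial location. It is computed from the input by learned weights,
    rather than by a fixed rule applied to a binary mask.
    """
    feature = conv2d_same(x, w_feat, b_feat)
    gate = sigmoid(conv2d_same(x, w_gate, b_gate))
    return elu(feature) * gate, gate


# Illustrative input: an RGB image with a hole, concatenated with its mask.
rng = np.random.default_rng(0)
image = rng.uniform(size=(8, 8, 3))
mask = np.zeros((8, 8, 1))
mask[2:5, 3:7] = 1.0  # 1 marks pixels to be filled
x = np.concatenate([image * (1.0 - mask), mask], axis=-1)

w_feat = rng.normal(scale=0.1, size=(3, 3, 4, 2))
w_gate = rng.normal(scale=0.1, size=(3, 3, 4, 2))
out, gate = gated_conv2d(x, w_feat, np.zeros(2), w_gate, np.zeros(2))
print(out.shape, gate.shape)
```

Because the gate is soft and computed at every layer, the network can learn which locations and channels to trust instead of following a hard valid/invalid rule, which is the sense in which the abstract says gated convolution generalizes partial convolution.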
