Semantic Image Inpainting with Progressive Generative Networks

Recently, image inpainting task has revived with the help of deep learning techniques. Deep neural networks, especially the generative adversarial networks~(GANs) make it possible to recover the missing details in images. Due to the lack of sufficient context information, most existing methods fail to get satisfactory inpainting results. This work investigates a more challenging problem, e.g., the newly-emerging semantic image inpainting - a task to fill in large holes in natural images. In this paper, we propose an end-to-end framework named progressive generative networks~(PGN), which regards the semantic image inpainting task as a curriculum learning problem. Specifically, we divide the hole filling process into several different phases and each phase aims to finish a course of the entire curriculum. After that, an LSTM framework is used to string all the phases together. By introducing this learning strategy, our approach is able to progressively shrink the large corrupted regions in natural images and yields promising inpainting results. Moreover, the proposed approach is quite fast to evaluate as the entire hole filling is performed in a single forward pass. Extensive experiments on Paris Street View and ImageNet dataset clearly demonstrate the superiority of our approach. Code for our models is available at https://github.com/crashmoon/Progressive-Generative-Networks.

[1]  Raymond H. Chan,et al.  A Primal–Dual Method for Total-Variation-Based Wavelet Domain Inpainting , 2012, IEEE Transactions on Image Processing.

[2]  PAUL J. WERBOS,et al.  Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[5]  Adam Finkelstein,et al.  PatchMatch: a randomized correspondence algorithm for structural image editing , 2009, SIGGRAPH 2009.

[6]  Guillermo Sapiro,et al.  Simultaneous structure and texture image inpainting , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[7]  Daniel Cohen-Or,et al.  Fragment-based image completion , 2003, ACM Trans. Graph..

[8]  Samy Bengio,et al.  Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.

[9]  Joachim Weickert,et al.  Coherence-Enhancing Diffusion Filtering , 1999, International Journal of Computer Vision.

[10]  Zongben Xu,et al.  Image Inpainting by Patch Propagation Using Patch Sparsity , 2010, IEEE Transactions on Image Processing.

[11]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Christine Guillemot,et al.  Examplar-based inpainting based on local geometry , 2011, 2011 18th IEEE International Conference on Image Processing.

[13]  Assaf Zomet,et al.  Learning how to inpaint from global image statistics , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14]  Guillermo Sapiro,et al.  Filling-in by joint interpolation of vector fields and gray levels , 2001, IEEE Trans. Image Process..

[15]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[16]  Alexei A. Efros,et al.  What makes Paris look like Paris? , 2015, Commun. ACM.

[17]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[18]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[20]  L. Rudin,et al.  Nonlinear total variation based noise removal algorithms , 1992 .

[21]  Pineda,et al.  Generalization of back-propagation to recurrent neural networks. , 1987, Physical review letters.

[22]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Hao Li,et al.  High-Resolution Image Inpainting Using Multi-scale Neural Patch Synthesis , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Guillermo Sapiro,et al.  Image inpainting , 2000, SIGGRAPH.

[25]  Ronald J. Williams,et al.  Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .

[26]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[27]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[28]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[29]  Patrick Pérez,et al.  Region filling and object removal by exemplar-based image inpainting , 2004, IEEE Transactions on Image Processing.

[30]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Harry Shum,et al.  Image completion with structure propagation , 2005, ACM Trans. Graph..

[32]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Christopher D. Manning,et al.  Bilingual Word Embeddings for Phrase-Based Machine Translation , 2013, EMNLP.