Attentional coarse-and-fine generative adversarial networks for image inpainting

Abstract Deep learning based image inpainting methods have recently made significant improvements, which can now generate visually realistic and semantically plausible pixels for large missing regions. However, blurry textures and distorted edges are the two main problems of these methods. In nature, they are caused by the ineffectiveness of utilizing features. Thus, we propose a coarse-and-fine structure, which can extract coarse features with a larger receptive field to guarantee the accuracy of global semantic information, and can simultaneously extract fine features with a smaller receptive field at multiple levels to serve as a supplement in a parallel manner. To avoid the artifacts caused by the missing part of image, we flexibly use the gating mechanism and propose an interleaved gated residual block (IGRB) to encourage the useful information flow through the neural network. Moreover, we propose a channel and spatial attention block (CSAB) to alleviate the influence of redundant information and better model the long-range dependency between different regions in the image. Extensive experiments on faces, natural objects and scenes demonstrate that our method outperforms the existing state-of-the-art methods.

[1]  Patrick Pérez,et al.  Region filling and object removal by exemplar-based image inpainting , 2004, IEEE Transactions on Image Processing.

[2]  Eli Shechtman,et al.  Image melding , 2012, ACM Trans. Graph..

[3]  Guillermo Sapiro,et al.  Image inpainting , 2000, SIGGRAPH.

[4]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[5]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2007, SIGGRAPH 2007.

[6]  Hao Li,et al.  High-Resolution Image Inpainting Using Multi-scale Neural Patch Synthesis , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[8]  Narendra Ahuja,et al.  Image completion using planar structure guidance , 2014, ACM Trans. Graph..

[9]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[11]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[12]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[13]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[15]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Sepp Hochreiter,et al.  Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.

[17]  Bolei Zhou,et al.  Places: A 10 Million Image Database for Scene Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Jianfei Cai,et al.  Pluralistic Image Completion , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Wei Xiong,et al.  Foreground-Aware Image Inpainting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Guillermo Sapiro,et al.  Filling-in by joint interpolation of vector fields and gray levels , 2001, IEEE Trans. Image Process..

[22]  Shiguang Shan,et al.  Shift-Net: Image Inpainting via Deep Feature Rearrangement , 2018, ECCV.

[23]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[24]  Jiaya Jia,et al.  Image completion with structure propagation , 2005, SIGGRAPH 2005.

[25]  Baining Guo,et al.  Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Thomas S. Huang,et al.  Free-Form Image Inpainting With Gated Convolution , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Adam Finkelstein,et al.  PatchMatch: a randomized correspondence algorithm for structural image editing , 2009, SIGGRAPH 2009.

[28]  Jianhong Shen,et al.  Digital inpainting based on the Mumford–Shah–Euler image model , 2002, European Journal of Applied Mathematics.

[29]  Djemel Ziou,et al.  Image Quality Metrics: PSNR vs. SSIM , 2010, 2010 20th International Conference on Pattern Recognition.