Deep Fusion Local-Content and Global-Semantic for Image Inpainting

Upsampling layers are used in almost all existing encoder-decoder based generative adversarial networks (GANs), which have shown promising results in image inpainting. However, existing upsampling layers (e.g., deconvolution and bilinear interpolation) suffer from two limitations: (1) they obtain little semantic information from the global structure, and (2) they can hardly capture local content details. To address these issues, we propose a Deep Fusion Local-content and Global-semantic (DFLG) model that is both effective and general. The DFLG model consists of four main components: the Local Content-Response (LCR) module, the pixel-shuffle operator, the Global Semantic-Aware (GSA) module, and the reassembly module. First, the LCR module divides the channels into several groups, then applies the squeeze-and-excitation mechanism within each group to capture the correlation between channels. Second, the pixel-shuffle operator reshapes depth in the channel dimension into width and height in the spatial dimensions, which transforms the correlation within groups in channel space into correlation within patches in spatial space. Next, the GSA module employs a patch-based spatial attention mechanism to calculate the correlation between different patches. Finally, the reassembly module refines the feature map. Furthermore, we propose a novel loss function called Attention Loss (ATLoss), which guides the network to concentrate on regions with obvious artifacts. Experiments on the CelebA-HQ, Places2, and Paris StreetView datasets demonstrate the effectiveness of the proposed method in image inpainting tasks and its ability to produce higher-quality images.
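The channel-to-space rearrangement performed by the pixel-shuffle operator can be illustrated with a minimal pure-Python sketch. It assumes the common sub-pixel convolution layout, in which each block of r*r consecutive channels fills one r-by-r spatial patch; the abstract does not state the exact channel ordering used in DFLG, so this ordering, the function name `pixel_shuffle`, and the toy input `features` are illustrative assumptions rather than the authors' implementation.

```python
def pixel_shuffle(x, r):
    """Rearrange a nested list of shape [C*r*r][H][W] into [C][H*r][W*r].

    Assumed layout (common sub-pixel convention, not confirmed by the paper):
    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    in_channels = len(x)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_channels = in_channels // (r * r)
    height, width = len(x[0]), len(x[0][0])

    # Allocate the upsampled output, filled with zeros.
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]

    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                # The r*r channels of group c at position (h, w)
                # become one r-by-r patch in the output.
                for i in range(r):
                    for j in range(r):
                        out[c][h * r + i][w * r + j] = x[c * r * r + i * r + j][h][w]
    return out


# Toy input: one group of 4 channels over a 2x2 grid.
# Every value equals the index of the channel it came from.
features = [[[ch] * 2 for _ in range(2)] for ch in range(4)]
shuffled = pixel_shuffle(features, 2)

for row in shuffled[0]:
    print(row)
```

In the printed 4x4 map, every 2x2 patch contains exactly the four channels of the original group. This is the sense in which a correlation computed within a channel group (as in the LCR module) turns into a correlation within a spatial patch after the shuffle.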
