Multistage attention network for image inpainting

Abstract: Image inpainting refers to the process of restoring the masked regions of damaged images. Existing inpainting algorithms perform well on tasks that focus on recovering small or square masks, but the reconstruction of images in which a large proportion is damaged can still be improved. Although many attention-based algorithms have been proposed for image inpainting, most of them ignore the need to balance detail and style. In this paper, we propose a novel image inpainting method for large-scale irregular masks. We introduce a multistage attention module that considers both structural consistency and detail fineness. The module operates in a coarse-to-fine manner: the early stage swaps large feature patches to ensure global consistency in the image, and the next stage swaps small patches to refine the texture. We also adopt a partial convolution strategy to avoid the misuse of invalid data during convolution. Several losses are combined as the training objective to generate results with global consistency and fine detail. Qualitative and quantitative experiments on the Paris StreetView, CelebA, and Places2 datasets demonstrate the superior performance of the proposed approach compared with state-of-the-art models.
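To illustrate the idea of avoiding invalid data during convolution, the following is a minimal sketch of a single-channel partial convolution in plain Python. It assumes the commonly used formulation in which each window is convolved over valid pixels only, rescaled by the ratio of window size to valid-pixel count, and the mask is updated wherever at least one valid pixel was seen. The abstract does not specify the exact variant used in the proposed network, so the function name `partial_conv2d`, its parameters, and the stride-1, no-padding setup are assumptions made for illustration.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Illustrative single-channel partial convolution (stride 1, no padding).

    image, mask: lists of rows of equal size; mask is 1.0 for valid pixels
    and 0.0 for hole pixels. kernel: square list of rows (k x k).
    Returns (output, updated_mask). Hypothetical helper, not the paper's code.
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    window_size = float(k * k)

    output = [[0.0] * out_w for _ in range(out_h)]
    new_mask = [[0.0] * out_w for _ in range(out_h)]

    for i in range(out_h):
        for j in range(out_w):
            valid = 0.0  # number of valid pixels in this window
            acc = 0.0    # weighted sum over valid pixels only
            for di in range(k):
                for dj in range(k):
                    m = mask[i + di][j + dj]
                    valid += m
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
            if valid > 0:
                # Rescale to compensate for the missing (hole) pixels.
                output[i][j] = acc * (window_size / valid) + bias
                # The window saw valid data, so this location becomes valid.
                new_mask[i][j] = 1.0
            # Otherwise output and mask stay 0: the window was entirely hole.
    return output, new_mask


if __name__ == "__main__":
    # 5x5 constant image with a 3x3 hole in the top-left corner.
    image = [[1.0] * 5 for _ in range(5)]
    mask = [[0.0 if (r < 3 and c < 3) else 1.0 for c in range(5)] for r in range(5)]
    kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # averaging filter
    out, updated = partial_conv2d(image, mask, kernel)
    for row in updated:
        print(row)
```

In this sketch, hole pixels never contribute to the output, and the updated mask shrinks the hole by one layer per application, which is why stacking such layers can gradually fill irregular masks.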
