StructureFlow: Image Inpainting via Structure-Aware Appearance Flow

Image inpainting techniques have shown significant improvements by using deep neural networks recently. However, most of them may either fail to reconstruct reasonable structures or restore fine-grained textures. In order to solve this problem, in this paper, we propose a two-stage model which splits the inpainting task into two parts: structure reconstruction and texture generation. In the first stage, edge-preserved smooth images are employed to train a structure reconstructor which completes the missing structures of the inputs. In the second stage, based on the reconstructed structures, a texture generator using appearance flow is designed to yield image details. Experiments on multiple publicly available datasets show the superior performance of the proposed network.

[1]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Eli Shechtman,et al.  PatchMatch: a randomized correspondence algorithm for structural image editing , 2009, ACM Trans. Graph..

[3]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[4]  Bolei Zhou,et al.  Places: A 10 Million Image Database for Scene Recognition , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Luc Van Gool,et al.  Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.

[6]  Guillermo Sapiro,et al.  Image inpainting , 2000, SIGGRAPH.

[7]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[8]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[9]  Hiroshi Ishikawa,et al.  Globally and locally consistent image completion , 2017, ACM Trans. Graph..

[10]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[11]  Jan Kautz,et al.  Video-to-Video Synthesis , 2018, NeurIPS.

[12]  Xiaoou Tang,et al.  Video Frame Synthesis Using Deep Voxel Flow , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Andrew Zisserman,et al.  Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[14]  Eli Shechtman,et al.  Image melding , 2012, ACM Trans. Graph..

[15]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[16]  Thomas S. Huang,et al.  Free-Form Image Inpainting With Gated Convolution , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[17]  Denis Simakov,et al.  Summarizing visual data using bidirectional similarity , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Cewu Lu,et al.  Image smoothing via L0 gradient minimization , 2011, ACM Trans. Graph..

[19]  Mehran Ebrahimi,et al.  EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning , 2019, ArXiv.

[20]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[22]  Qin Huang,et al.  SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting , 2018, BMVC.

[23]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[24]  Michael J. Black,et al.  Learning Optical Flow , 2008, ECCV.

[25]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Stefan Roth,et al.  UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss , 2017, AAAI.

[27]  Li Xu,et al.  Structure extraction from texture via relative total variation , 2012, ACM Trans. Graph..

[28]  Alexei A. Efros,et al.  What makes Paris look like Paris? , 2015, Commun. ACM.

[29]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2008, Commun. ACM.

[30]  Alexei A. Efros,et al.  Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Jitendra Malik,et al.  View Synthesis by Appearance Flow , 2016, ECCV.

[32]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[33]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Alexei A. Efros,et al.  Image quilting for texture synthesis and transfer , 2001, SIGGRAPH.

[35]  Michael J. Black,et al.  Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).