Deep Face Video Inpainting via UV Mapping

This paper addresses the problem of face video inpainting. Existing video inpainting methods primarily target natural scenes with repetitive patterns. They do not use any prior knowledge of the face to help retrieve correspondences for the corrupted face. They therefore achieve only sub-optimal results, particularly for faces under large pose and expression variations, where face components appear very different across frames. In this paper, we propose a two-stage deep learning method for face video inpainting. We employ a 3D morphable model (3DMM) as our 3D face prior to transform a face between the image space and the UV (texture) space. In Stage I, we perform face inpainting in the UV space. This largely removes the influence of face poses and expressions and makes the learning task much easier because the face features are well aligned. We introduce a frame-wise attention module to fully exploit correspondences in neighboring frames to assist the inpainting task. In Stage II, we transform the inpainted face regions back to the image space and perform face video refinement, which inpaints any background regions not covered in Stage I and also refines the inpainted face regions. Extensive experiments show that our method significantly outperforms methods based merely on 2D information, especially for faces under large pose and expression variations.
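To illustrate why alignment in UV space simplifies the use of neighboring frames, the following is a minimal sketch of attention across frames computed independently at each texel. It is not the paper's module, whose design is not given in the abstract; the function name `frame_wise_attention`, the feature shapes, and the scaled dot-product similarity are all assumptions made for illustration. The idea it shows is that once every frame is unwrapped to the same UV layout, a corrupted texel in the target frame can borrow from the same texel location in other frames, weighted by feature similarity and restricted to texels that are actually visible there.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def frame_wise_attention(target, refs, ref_valid):
    """Aggregate reference-frame UV features for one target frame.

    Hypothetical shapes (assumptions, not taken from the paper):
      target:    (H, W, C) features of the target UV map
      refs:      (T, H, W, C) features of T neighboring UV maps
      ref_valid: (T, H, W) 1 where a reference texel is known, 0 where masked
    Returns an (H, W, C) array of aggregated features.
    """
    channels = target.shape[-1]
    valid = ref_valid > 0

    # Per-texel similarity between the target and each reference frame.
    scores = np.einsum("hwc,thwc->thw", target, refs) / np.sqrt(channels)

    # Masked reference texels must not contribute to the weighted sum.
    scores = np.where(valid, scores, -1e9)
    weights = softmax(scores, axis=0) * valid

    aggregated = np.einsum("thw,thwc->hwc", weights, refs)

    # Where no reference frame has a valid texel, keep the target feature.
    any_valid = valid.any(axis=0)
    return np.where(any_valid[..., None], aggregated, target)
```

Because the attention is computed over the frame axis only, this sketch relies entirely on the UV alignment for spatial correspondence; a 2D image-space method would instead have to search across spatial positions to compensate for pose and expression changes.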
