Prior-Guided Multi-View 3D Head Reconstruction

Recovering a 3D head model that includes the complete face and hair regions remains a challenging problem in computer vision and graphics. In this paper, we address this problem using only a few multi-view portrait images as input. Previous multi-view stereo methods, whether based on optimization strategies or on deep learning techniques, tend to recover only low-frequency geometric structures, which results in unclear head structures and inaccurate reconstruction in hair regions. To tackle this problem, we propose a prior-guided implicit neural rendering network. Specifically, we model the head geometry with a learnable signed distance field (SDF) and optimize it via an implicit differentiable renderer under the guidance of human head priors: facial prior knowledge, head semantic segmentation information, and 2D hair orientation maps. These priors improve reconstruction accuracy and robustness, leading to a high-quality integrated 3D head model. Extensive ablation studies and comparisons with state-of-the-art methods demonstrate that our method generates high-fidelity 3D head geometries when guided by these priors.
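
As a rough illustration of the signed distance field representation mentioned above, the following sketch finds the first ray-surface intersection by sphere tracing. It is not the proposed network: an analytic unit sphere stands in for the learnable SDF, and the names `sphere_sdf` and `sphere_trace`, along with all parameter values, are hypothetical choices made for this example only.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside.

    Assumption: an analytic stand-in for a learned SDF network.
    """
    return math.dist(p, center) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, eps=1e-6, t_max=100.0):
    """March along a ray, advancing by the SDF value at each step.

    Returns (t, point) for the first surface hit, or None if the ray misses.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)  # unit-length ray direction
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * c for o, c in zip(origin, d))
        dist = sdf(p)
        if dist < eps:       # close enough to the zero level set
            return t, p
        t += dist            # safe step: no surface is closer than dist
        if t > t_max:        # ray left the region of interest
            break
    return None

# A ray cast from z = -3 toward the origin reaches the unit sphere at z = -1.
hit = sphere_trace(sphere_sdf, origin=(0.0, 0.0, -3.0), direction=(0.0, 0.0, 1.0))
```

In an implicit differentiable renderer, the intersection point found this way would typically be made differentiable with respect to the SDF parameters so that image-space losses can update the geometry; that step is omitted here.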
