S2F2: Self-Supervised High Fidelity Face Reconstruction from Monocular Image

We present a novel face reconstruction method capable of reconstructing detailed face geometry and spatially varying face reflectance from a single monocular image. We build on recent advances in DNN-based auto-encoders with differentiable ray tracing image formation, trained in a self-supervised manner. While these methods offer the advantages of learning-based approaches and real-time reconstruction, they lack fidelity. In this work, we achieve, for the first time, high-fidelity face reconstruction using self-supervised learning only. Our novel coarse-to-fine deep architecture allows us to solve the challenging problem of decoupling face reflectance from geometry using a single image, at high computational speed. Compared to state-of-the-art methods, our method achieves more visually appealing reconstructions.
