Practical Face Reconstruction via Differentiable Ray Tracing

We present a differentiable ray-tracing based novel face reconstruction approach where scene attributes – 3D geometry, reflectance (diffuse, specular and roughness), pose, camera parameters, and scene illumination – are estimated from unconstrained monocular images. The proposed method models scene illumination via a novel, parameterized virtual light stage, which in-conjunction with differentiable ray-tracing, introduces a coarse-to-fine optimization formulation for face reconstruction. Our method can not only handle unconstrained illumination and self-shadows conditions, but also estimates diffuse and specular albedos. To estimate the face attributes consistently and with practical semantics, a two-stage optimization strategy systematically uses a subset of parametric attributes, where subsequent attribute estimations factor those previously estimated. For example, self-shadows estimated during the first stage, later prevent its baking into the personalized diffuse and specular albedos in the second stage. We show the efficacy of our approach in several real-world scenarios, where face attributes can be estimated even under extreme illumination conditions. Ablation studies, analyses and comparisons against several recent state-of-the-art methods show improved accuracy and versatility of our approach. With consistent face attributes reconstruction, our method leads to several style – illumination, albedo, self-shadow – edit and transfer applications, as discussed in the paper.

[1]  Michael J. Black,et al.  Generating 3D faces using Convolutional Mesh Autoencoders , 2018, ECCV.

[2]  B. Smith,et al.  Geometrical shadowing of a random rough surface , 1967 .

[3]  Andrew Jones,et al.  Mesoscopic Facial Geometry Inference Using Deep Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Hao Li,et al.  Photorealistic Facial Texture Inference Using Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Richard Szeliski,et al.  The lumigraph , 1996, SIGGRAPH.

[6]  Stefanos Zafeiriou,et al.  AvatarMe: Realistically Renderable 3D Facial Reconstruction “In-the-Wild” , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Thabo Beeler,et al.  Single-shot high-quality facial geometry and skin appearance capture , 2020, ACM Trans. Graph..

[8]  Steve Marschner,et al.  Microfacet Models for Refraction through Rough Surfaces , 2007, Rendering Techniques.

[9]  Georgios Tzimiropoulos,et al.  How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks) , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Erik Reinhard,et al.  High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting , 2010 .

[11]  Feng Liu,et al.  Towards High-Fidelity Nonlinear 3D Face Morphable Model , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Hao Li,et al.  paGAN: real-time avatars using dynamic textures , 2019, ACM Trans. Graph..

[13]  Thabo Beeler,et al.  3D Morphable Face Models—Past, Present, and Future , 2020, ACM Trans. Graph..

[14]  Christian Theobalt,et al.  Reconstructing detailed dynamic face geometry from monocular video , 2013, ACM Trans. Graph..

[15]  Wan-Chun Ma,et al.  The Digital Emily Project: Achieving a Photorealistic Digital Actor , 2010, IEEE Computer Graphics and Applications.

[16]  Tomas Akenine-Möller,et al.  Fast, Minimum Storage Ray-Triangle Intersection , 1997, J. Graphics, GPU, & Game Tools.

[17]  Patrick Pérez,et al.  MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[18]  P. Debevec,et al.  Practical modeling and acquisition of layered facial reflectance , 2008, SIGGRAPH Asia '08.

[19]  Paul E. Debevec,et al.  Acquiring the reflectance field of a human face , 2000, SIGGRAPH.

[20]  Carlos D. Castillo,et al.  SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild' , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Michael J. Black,et al.  Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Yaser Sheikh,et al.  3D object manipulation in a single photograph using stock 3D models , 2014, ACM Trans. Graph..

[23]  Wenzel Jakob,et al.  Reparameterizing discontinuous integrands for differentiable rendering , 2019, ACM Trans. Graph..

[24]  F. E. Nicodemus,et al.  Geometrical considerations and nomenclature for reflectance , 1977 .

[25]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..

[26]  Louis Chevallier,et al.  Face Reflectance and Geometry Modeling via Differentiable Ray Tracing , 2019, ArXiv.

[27]  Frédo Durand,et al.  Style transfer for headshot portraits , 2014, ACM Trans. Graph..

[28]  Greg Humphreys,et al.  Physically Based Rendering: From Theory to Implementation , 2004 .

[29]  Hans-Peter Seidel,et al.  Shading-based dynamic shape refinement from multi-view video under general illumination , 2011, 2011 International Conference on Computer Vision.

[30]  Jeffrey F. Cohn,et al.  The 2nd 3D Face Alignment in the Wild Challenge (3DFAW-Video): Dense Reconstruction From Video , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[31]  Derek Bradley,et al.  Practical dynamic facial appearance modeling and acquisition , 2018, ACM Trans. Graph..

[32]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[33]  Simon Lucey,et al.  Deformable Model Fitting by Regularized Landmark Mean-Shift , 2010, International Journal of Computer Vision.

[34]  Ira Kemelmacher-Shlizerman,et al.  Total Moving Face Reconstruction , 2014, ECCV.

[35]  Sylvain Paris,et al.  Portrait lighting transfer using a mass transport approach , 2017, TOGS.

[36]  Andrew Jones,et al.  Practical multispectral lighting reproduction , 2016, ACM Trans. Graph..

[37]  John M. Snyder,et al.  Parameterized environment maps , 2001, I3D '01.

[38]  Xiaoming Liu,et al.  Nonlinear 3D Face Morphable Model , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Derek Bradley,et al.  High-quality passive facial performance capture using anchor frames , 2011, ACM Trans. Graph..

[40]  Kenny Mitchell,et al.  Photo-Realistic Facial Details Synthesis From Single Image , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[41]  M. Gross,et al.  Analysis of human faces using a measurement-based skin reflectance model , 2006, ACM Trans. Graph..

[42]  Pieter Peers,et al.  Estimating Specular Roughness and Anisotropy from Second Order Spherical Gradient Illumination , 2009, Comput. Graph. Forum.

[43]  Robert L. Cook,et al.  A Reflectance Model for Computer Graphics , 1987, TOGS.

[44]  Hans-Peter Seidel,et al.  FML: Face Model Learning From Videos , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Edward Angel,et al.  Interactive Computer Graphics: A Top-Down Approach with Shader-Based OpenGL , 2011 .

[46]  Hao Li,et al.  Learning Formation of Physically-Based Face Attributes , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Jaakko Lehtinen,et al.  Differentiable Monte Carlo ray tracing through edge sampling , 2018, ACM Trans. Graph..

[48]  H. Takiwaki Measurement of skin color: practical application and theoretical considerations. , 1998, The journal of medical investigation : JMI.

[49]  Yun-Ta Tsai,et al.  Single image portrait relighting , 2019, ACM Trans. Graph..

[50]  Paul Graham,et al.  Near‐Instant Capture of High‐Resolution Facial Geometry and Reflectance , 2016, Comput. Graph. Forum.

[51]  Bernhard Egger,et al.  Morphable Face Models - An Open Framework , 2017, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[52]  Jaakko Lehtinen,et al.  Production-level facial performance capture using deep convolutional neural networks , 2016, Symposium on Computer Animation.

[53]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[54]  Xiaoming Liu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Kun Zhou,et al.  Intrinsic Face Image Decomposition with Human Face Priors , 2014, ECCV.

[56]  Michael J. Black,et al.  Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Christophe Schlick,et al.  An Inexpensive BRDF Model for Physically‐based Rendering , 1994, Comput. Graph. Forum.

[58]  Shigeo Morishima,et al.  High-fidelity facial reflectance and geometry inference from an unconstrained image , 2018, ACM Trans. Graph..

[59]  James T. Kajiya,et al.  The rendering equation , 1986, SIGGRAPH.

[60]  Paul E. Debevec,et al.  Multiview face capture using polarized spherical gradient illumination , 2011, ACM Trans. Graph..

[61]  Ersin Yumer,et al.  Neural Face Editing with Intrinsic Image Disentangling , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Michael J. Black,et al.  Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph..

[63]  Yaser Sheikh,et al.  Modeling Facial Geometry Using Compositional VAEs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[64]  Patrick Pérez,et al.  Deep video portraits , 2018, ACM Trans. Graph..

[65]  S. Umeyama,et al.  Least-Squares Estimation of Transformation Parameters Between Two Point Patterns , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[66]  Hans-Peter Seidel,et al.  Lightweight binocular facial performance capture under uncontrolled lighting , 2012, ACM Trans. Graph..

[67]  P. Hanrahan,et al.  On the relationship between radiance and irradiance: determining the illumination from images of a convex Lambertian object. , 2001, Journal of the Optical Society of America. A, Optics, image science, and vision.

[68]  P. Ekman,et al.  What the face reveals : basic and applied studies of spontaneous expression using the facial action coding system (FACS) , 2005 .