Monocular Reconstruction of Neural Face Reflectance Fields

The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflectance as higher-order global illumination effects and self-shadowing are not modeled. We present a new neural representation for face reflectance where we can estimate all components of the reflectance responsible for the final appearance from a single monocular image. Instead of modeling each component of the reflectance separately using parametric models, our neural representation allows us to generate a basis set of faces in a geometric deformation-invariant space, parameterized by the input light direction, viewpoint and face geometry. We learn to reconstruct this reflectance field of a face just from a monocular image, which can be used to render the face from any viewpoint in any light condition. Our method is trained on a light-stage training dataset, which captures 300 people illuminated with 150 light conditions from 8 viewpoints. We show that our method outperforms existing monocular reflectance reconstruction methods, in terms of photorealism due to better capturing of physical premitives, such as sub-surface scattering, specularities, self-shadows and other higher-order effects.

[1]  Hao Li,et al.  Learning Formation of Physically-Based Face Attributes , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Simon Lucey,et al.  Real-time avatar animation from a single image , 2011, Face and Gesture 2011.

[3]  BeelerThabo,et al.  3D Morphable Face Models—Past, Present, and Future , 2020 .

[4]  Bernhard Egger,et al.  A Morphable Face Albedo Model , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Yannick Hold-Geoffroy,et al.  Deep Sky Modeling for Single Image Outdoor Lighting Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Simon Lucey,et al.  Real-time avatar animation from a single image , 2011, Face and Gesture 2011.

[7]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[8]  William A. P. Smith,et al.  Inverse Rendering of Faces on a Cloudy Day , 2012, ECCV.

[9]  Davis E. King,et al.  Dlib-ml: A Machine Learning Toolkit , 2009, J. Mach. Learn. Res..

[10]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[11]  Paul E. Debevec,et al.  Acquiring the reflectance field of a human face , 2000, SIGGRAPH.

[12]  Matan Sela,et al.  Learning Detailed Face Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Stefanos Zafeiriou,et al.  AvatarMe: Realistically Renderable 3D Facial Reconstruction “In-the-Wild” , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Hans-Peter Seidel,et al.  FML: Face Model Learning From Videos , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Pierre Alliez,et al.  Polygon Mesh Processing , 2010 .

[17]  Feng Liu,et al.  Towards High-Fidelity Nonlinear 3D Face Morphable Model , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..

[19]  David W. Jacobs,et al.  Deep Single-Image Portrait Relighting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[20]  Shigeo Morishima,et al.  High-fidelity facial reflectance and geometry inference from an unconstrained image , 2018, ACM Trans. Graph..

[21]  LalondeJean-François,et al.  Learning to predict indoor illumination from a single image , 2017 .

[22]  Yun-Ta Tsai,et al.  Single image portrait relighting , 2019, ACM Trans. Graph..

[23]  Patrick Pérez,et al.  Corrective 3D reconstruction of lips from monocular video , 2016, ACM Trans. Graph..

[24]  Xiaoming Liu,et al.  On Learning 3D Face Morphable Model from In-the-Wild Images , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Pat Hanrahan,et al.  A signal-processing framework for inverse rendering , 2001, SIGGRAPH.

[26]  Qionghai Dai,et al.  Capturing Relightable Human Performances under General Uncontrolled Illumination , 2013, Comput. Graph. Forum.

[27]  Jean-François Lalonde,et al.  Learning Physics-Guided Face Relighting Under Directional Light , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jaakko Lehtinen,et al.  Modular primitives for high-performance differentiable rendering , 2020, ACM Trans. Graph..

[29]  Paul E. Debevec,et al.  Multiview face capture using polarized spherical gradient illumination , 2011, ACM Trans. Graph..

[30]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[31]  Paul E. Debevec,et al.  Cosine Lobe Based Relighting from Gradient Illumination Photographs , 2009, 2009 Conference for Visual Media Production.

[32]  Tal Hassner,et al.  Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Bernhard Egger,et al.  Efficient Global Illumination for Morphable Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  M. Gross,et al.  Analysis of human faces using a measurement-based skin reflectance model , 2006, ACM Trans. Graph..

[35]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[37]  Ersin Yumer,et al.  Learning to predict indoor illumination from a single image , 2017, ACM Trans. Graph..

[38]  M. Zollhöfer,et al.  Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Thabo Beeler,et al.  3D Morphable Face Models—Past, Present, and Future , 2020, ACM Trans. Graph..

[40]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[41]  Patrick Pérez,et al.  MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[42]  Georgios Tzimiropoulos,et al.  How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks) , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[43]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.