Photo-Realistic Facial Details Synthesis From Single Image

We present a single-image 3D face synthesis technique that can handle challenging facial expressions while recovering fine geometric details. Our technique employs expression analysis for proxy face geometry generation and combines supervised and unsupervised learning for facial detail synthesis. On proxy generation, we conduct emotion prediction to determine a new expression-informed proxy. On detail synthesis, we present a Deep Facial Detail Net (DFDN) based on Conditional Generative Adversarial Net (CGAN) that employs both geometry and appearance loss functions. For geometry, we capture 366 high-quality 3D scans from 122 different subjects under 3 facial expressions. For appearance, we use additional 163K in-the-wild face images and apply image-based rendering to accommodate lighting variations. Comprehensive experiments demonstrate that our framework can produce high-quality 3D faces with realistic details under challenging facial expressions.

[1]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[2]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[3]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[4]  Sami Romdhani,et al.  Estimating 3D shape and texture using pixel intensity, edges, specular highlights, texture constraints and a prior , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Nahum Kiryati,et al.  Photometric stereo under perspective projection , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[6]  Hanspeter Pfister,et al.  Face transfer with multilinear models , 2005, ACM Trans. Graph..

[7]  Jun Wang,et al.  A 3D facial expression database for facial behavior research , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[8]  Pieter Peers,et al.  Rapid Acquisition of Specular and Diffuse Normal Maps from Polarized Spherical Gradient Illumination , 2007 .

[9]  Roberto Cipolla,et al.  Multiview Photometric Stereo , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[11]  Gang Hua,et al.  Face Relighting from a Single Image under Arbitrary Unknown Lighting Conditions , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Daniel G. Aliaga,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 a Self-calibrating Method for Photogeometric Acquisition of 3d Objects , 2022 .

[13]  Wan-Chun Ma,et al.  The Digital Emily Project: Achieving a Photorealistic Digital Actor , 2010, IEEE Computer Graphics and Applications.

[14]  Paul E. Debevec,et al.  Effect of illumination on automatic expression recognition: A novel 3D relightable facial database , 2011, Face and Gesture 2011.

[15]  Ira Kemelmacher-Shlizerman,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 3d Face Reconstruction from a Single Image Using a Single Reference Face Shape , 2022 .

[16]  Paul E. Debevec,et al.  Multiview face capture using polarized spherical gradient illumination , 2011, ACM Trans. Graph..

[17]  Qionghai Dai,et al.  Fusing Multiview and Photometric Stereo for 3D Reconstruction under Uncalibrated Illumination , 2011, IEEE Transactions on Visualization and Computer Graphics.

[18]  M. Westoby,et al.  ‘Structure-from-Motion’ photogrammetry: A low-cost, effective tool for geoscience applications , 2012 .

[19]  Paolo Favaro,et al.  A New Perspective on Uncalibrated Photometric Stereo , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Kun Zhou,et al.  3D shape regression for real-time facial animation , 2013, ACM Trans. Graph..

[21]  Kun Zhou,et al.  Displaced dynamic expression regression for real-time facial tracking and animation , 2014, ACM Trans. Graph..

[22]  Xin Tong,et al.  Automatic acquisition of high-fidelity facial performances using monocular videos , 2014, ACM Trans. Graph..

[23]  Yiying Tong,et al.  FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.

[24]  Yvain Quéau Reconstruction tridimensionnelle par stéréophotométrie , 2015 .

[25]  Michael Goesele,et al.  A Survey of Photometric Stereo Techniques , 2015, Found. Trends Comput. Graph. Vis..

[26]  Thabo Beeler,et al.  Real-time high-fidelity facial performance capture , 2015, ACM Trans. Graph..

[27]  Xiangyu Zhu,et al.  Discriminative 3D morphable model fitting , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[28]  Peter Robinson,et al.  Cross-dataset learning and person-specific normalisation for automatic Action Unit detection , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[29]  Christian Theobalt,et al.  Reconstruction of Personalized 3D Face Rigs from Monocular Video , 2016, ACM Trans. Graph..

[30]  Matan Sela,et al.  3D Face Reconstruction by Learning from Synthetic Data , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[31]  William J. Christmas,et al.  A Multiresolution 3D Morphable Face Model and Fitting Framework , 2016, VISIGRAPP.

[32]  Bernhard Egger,et al.  Markov Chain Monte Carlo for Automated Face Image Analysis , 2016, International Journal of Computer Vision.

[33]  Hao Li,et al.  Real-Time Facial Segmentation and Performance Capture from RGB Input , 2016, ECCV.

[34]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Paul Graham,et al.  Near‐Instant Capture of High‐Resolution Facial Geometry and Reflectance , 2016, Comput. Graph. Forum.

[36]  Yasuyuki Matsushita,et al.  Robust Multiview Photometric Stereo Using Planar Mesh Parameterization , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Georgios Tzimiropoulos,et al.  Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[38]  Ioannis A. Kakadiaris,et al.  End-to-End 3D Face Reconstruction with Deep Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Chi-Ho Chan,et al.  Efficient 3D morphable face model fitting , 2017, Pattern Recognit..

[40]  Matan Sela,et al.  Learning Detailed Face Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Patrick Pérez,et al.  MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[42]  Ron Kimmel,et al.  Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[43]  Michael J. Black,et al.  Learning a model of facial shape and expression from 4D scans , 2017, ACM Trans. Graph..

[44]  Stefanos Zafeiriou,et al.  Large Scale 3D Morphable Models , 2017, International Journal of Computer Vision.

[45]  Tal Hassner,et al.  Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Carlos D. Castillo,et al.  SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild' , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47]  Justus Thies,et al.  InverseFaceNet: Deep Monocular Inverse Face Rendering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[48]  Ramakant Nevatia,et al.  ExpNet: Landmark-Free, Deep, 3D Facial Expressions , 2018, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[49]  M. Zollhöfer,et al.  Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[50]  Kenny Mitchell,et al.  Feature-preserving detailed 3D face reconstruction from a single image , 2018, CVMP '18.

[51]  Andrew Jones,et al.  Mesoscopic Facial Geometry Inference Using Deep Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[52]  William T. Freeman,et al.  Unsupervised Training for 3D Morphable Model Regression , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[53]  Xin Chen,et al.  Sparse Photometric 3D Face Reconstruction Guided by Morphable Models , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[54]  Stefanos Zafeiriou,et al.  3D Reconstruction of “In-the-Wild” Faces in Images and Videos , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  Bernhard Egger,et al.  Morphable Face Models - An Open Framework , 2017, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[56]  Bailin Deng,et al.  3D Face Reconstruction With Geometry Details From a Single Image , 2017, IEEE Transactions on Image Processing.

[57]  Tal Hassner,et al.  Extreme 3D Face Reconstruction: Seeing Through Occlusions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[58]  Mohammad H. Mahoor,et al.  AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild , 2017, IEEE Transactions on Affective Computing.

[59]  Matias Valdenegro-Toro,et al.  Real-time Convolutional Neural Networks for emotion and gender classification , 2017, ESANN.

[60]  Jianfei Cai,et al.  CNN-Based Real-Time Dense Face Reconstruction with Inverse-Rendered Photo-Realistic Face Images , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[61]  Justus Thies,et al.  Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.