Deep 3D morphable model refinement via progressive growing of conditional Generative Adversarial Networks

Abstract: 3D face reconstruction from a single 2D image is a fundamental and extraordinarily difficult Computer Vision problem. Statistical modeling techniques, such as the 3D Morphable Model (3DMM), have been widely used because they can reconstruct a plausible model grounded in prior knowledge of the facial shape. However, most of these techniques produce an approximate, smooth reconstruction of the face that does not account for fine-grained details. In this work, we propose an approach based on a Conditional Generative Adversarial Network (CGAN) for refining the coarse reconstruction provided by a 3DMM. The coarse reconstruction is represented as a three-channel image, where the pixel intensities represent the depth, curvature and elevation values of the 3D vertices. The architecture is an encoder–decoder that is trained progressively, starting from the lower-resolution layers; this technique makes training more stable, which leads to high-quality outputs even when high-resolution images are used during training. Experimental results show that our method produces reconstructions with fine-grained, realistic details and lower reconstruction errors than the 3DMM. A cross-dataset evaluation also shows that the network retains good generalization capabilities. Finally, comparisons with state-of-the-art solutions show competitive performance, with comparable or lower error in most cases and a clear improvement in the quality of the generated models.
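To make the three-channel representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the depth, curvature and elevation values have already been rasterized into three H x W maps; the function name `to_three_channel_image` and the per-channel min-max scaling to [0, 1] are illustrative assumptions, since the abstract does not state how the values are projected or normalized.

```python
import numpy as np

def to_three_channel_image(depth, curvature, elevation):
    """Stack three per-pixel maps into an H x W x 3 float image in [0, 1].

    Hypothetical helper: the min-max scaling used here is an assumption,
    not a detail taken from the paper.
    """
    channels = []
    for m in (depth, curvature, elevation):
        m = np.asarray(m, dtype=np.float64)
        span = m.max() - m.min()
        # A constant map has no range to scale, so it becomes all zeros.
        scaled = (m - m.min()) / span if span > 0 else np.zeros_like(m)
        channels.append(scaled)
    # Channel order (depth, curvature, elevation) follows the abstract's wording.
    return np.stack(channels, axis=-1)

# Synthetic stand-in maps; a real pipeline would rasterize the 3DMM mesh.
rng = np.random.default_rng(0)
h, w = 8, 8
image = to_three_channel_image(
    rng.normal(size=(h, w)),
    rng.normal(size=(h, w)),
    rng.normal(size=(h, w)),
)
```

An image of this form can be passed to an image-to-image network in the same way as an RGB picture, which is what lets a conditional GAN operate on the geometry.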
