"Look Ma, No Landmarks!" - Unsupervised, Model-Based Dense Face Alignment

In this paper, we show how to train an image-to-image network to predict dense correspondence between a face image and a 3D morphable model using only the model for supervision. We show that both geometric parameters (shape, pose and camera intrinsics) and photometric parameters (texture and lighting) can be inferred directly from the correspondence map using linear least squares and our novel inverse spherical harmonic lighting model. The least squares residuals provide an unsupervised training signal that allows us to avoid artefacts common in the literature such as shrinking and conservative underfitting. Our approach uses a network that is 10ˆ smaller than parameter regression networks, significantly reduces sensitivity to image alignment and allows known camera calibration or multi-image constraints to be incorporated during inference. We achieve results competitive with state-of-the-art but without any auxiliary supervision used by previous methods.

[1]  Jiaolong Yang,et al.  Accurate 3D Face Reconstruction With Weakly-Supervised Learning: From Single Image to Image Set , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3]  Ron Kimmel,et al.  Unrestricted Facial Geometry Reconstruction Using Image-to-Image Translation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Xi Zhou,et al.  Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network , 2018, ECCV.

[5]  M. Zollhöfer,et al.  Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Volker Blanz,et al.  Automated 3D Face Reconstruction from Multiple Images Using Quality Measures , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Cheng Li,et al.  Face alignment by coarse-to-fine shape searching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Daniel E. Crispell,et al.  Pix2Face: Direct 3D Face Model Estimation , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[9]  Jian Sun,et al.  Face Alignment by Explicit Shape Regression , 2012, International Journal of Computer Vision.

[10]  William T. Freeman,et al.  Unsupervised Training for 3D Morphable Model Regression , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Feng Liu,et al.  Landmark-Based 3D Face Reconstruction from an Arbitrary Number of Unconstrained Images , 2018, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[12]  Carlos D. Castillo,et al.  SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the Wild' , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[13]  Patrick Pérez,et al.  MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Pat Hanrahan,et al.  An efficient representation for irradiance environment maps , 2001, SIGGRAPH.

[15]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[16]  Junjie Yan,et al.  Learn to Combine Multiple Hypotheses for Accurate Face Alignment , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[17]  Hans-Peter Seidel,et al.  FML: Face Model Learning From Videos , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  George Trigeorgis,et al.  Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[20]  Qijun Zhao,et al.  Evaluation of Dense 3D Reconstruction from 2D Face Images in the Wild , 2018, 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018).

[21]  Justus Thies,et al.  InverseFaceNet: Deep Single-Shot Inverse Face Rendering From A Single Image , 2017, ArXiv.

[22]  Horst Bischof,et al.  Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[23]  Xiaoming Liu,et al.  Nonlinear 3D Face Morphable Model , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Matan Sela,et al.  3D Face Reconstruction by Learning from Synthetic Data , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[25]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[26]  Tal Hassner,et al.  Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Iasonas Kokkinos,et al.  DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Jian Sun,et al.  Face Alignment at 3000 FPS via Regressing Local Binary Features , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[30]  Justus Thies,et al.  InverseFaceNet: Deep Monocular Inverse Face Rendering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[32]  Oswald Aldrian,et al.  Inverse Rendering of Faces with a 3D Morphable Model , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Xiaoming Liu,et al.  On Learning 3D Face Morphable Model from In-the-Wild Images , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Thabo Beeler,et al.  3D Morphable Face Models—Past, Present, and Future , 2020, ACM Trans. Graph..

[35]  Jason Yosinski,et al.  An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution , 2018, NeurIPS.

[36]  Michael J. Black,et al.  Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Xiaoming Liu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Hao Li,et al.  Learning Dense Facial Correspondences in Unconstrained Images , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).