Dense 3D face alignment from 2D video for real-time use

To enable real-time, person-independent 3D registration from 2D video, we developed a 3D cascade regression approach in which facial landmarks remain invariant across pose over a range of approximately 60 degrees. From a single 2D image of a person's face, a dense 3D shape is registered in real time for each frame. The algorithm uses a fast cascade regression framework trained on high-resolution 3D face scans of posed and spontaneous emotion expression. It first estimates the location of a dense set of landmarks and their visibility, then reconstructs face shapes by fitting a part-based 3D model. Because no assumptions are required about illumination or surface properties, the method can be applied to a wide range of imaging conditions, including 2D video and uncalibrated multi-view video. The method has been validated in a battery of experiments that evaluate its precision of 3D reconstruction, extension to multi-view reconstruction, temporal integration for videos, and 3D head-pose estimation. Experimental findings strongly support the validity of real-time 3D registration and reconstruction from 2D video. The software is available online at http://zface.org.
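
The general idea behind cascade regression can be illustrated with a small sketch: starting from a mean shape, each stage regresses a shape update from features computed at the current estimate, and the stages are applied in sequence. The code below is a toy under stated assumptions, not the method described above. The feature function `extract_features`, the synthetic "images" (which here directly encode the true landmark coordinates), the ridge-regularized linear regressors, and all parameter values are hypothetical choices made for illustration. Landmark visibility and the part-based 3D model fit are not covered.

```python
import numpy as np

def extract_features(images, shapes):
    # Hypothetical feature function. Real systems sample appearance
    # descriptors around the current landmarks; this toy returns a
    # saturating function of the offset between image content and shape.
    return np.tanh(images - shapes)

def add_bias(feats):
    # Append a constant column so each stage can learn an offset term.
    return np.hstack([feats, np.ones((feats.shape[0], 1))])

def train_cascade(images, true_shapes, init_shapes, n_stages=4, ridge=1e-3):
    """Train a sequence of linear regressors, one per cascade stage."""
    shapes = init_shapes.copy()
    stages = []
    for _ in range(n_stages):
        A = add_bias(extract_features(images, shapes))
        target = true_shapes - shapes  # remaining shape error to regress
        # Ridge-regularized least squares: (A^T A + ridge * I) W = A^T target
        W = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ target)
        stages.append(W)
        shapes = shapes + A @ W  # move training shapes before the next stage
    return stages

def apply_cascade(stages, images, init_shapes):
    """Refine an initial shape estimate by applying each stage in order."""
    shapes = init_shapes.copy()
    for W in stages:
        shapes = shapes + add_bias(extract_features(images, shapes)) @ W
    return shapes

# Synthetic data: each row is a flattened set of landmark coordinates.
rng = np.random.default_rng(0)
true_train = rng.normal(size=(500, 6))
true_test = rng.normal(size=(100, 6))

# Every sample starts from the mean training shape.
mean_shape = true_train.mean(axis=0)
init_train = np.tile(mean_shape, (len(true_train), 1))
init_test = np.tile(mean_shape, (len(true_test), 1))

stages = train_cascade(true_train, true_train, init_train)
pred = apply_cascade(stages, true_test, init_test)

err_before = np.mean(np.linalg.norm(true_test - init_test, axis=1))
err_after = np.mean(np.linalg.norm(true_test - pred, axis=1))
print(f"mean error before: {err_before:.3f}, after: {err_after:.3f}")
```

Each stage is trained on the shapes produced by the previous stages, so later regressors only correct the error that earlier ones leave behind. That sequential refinement is what makes the approach a cascade.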
