Paper information - Dense 3D face alignment from 2D video for real-time use

To enable real-time, person-independent 3D registration from 2D video, we developed a 3D cascade regression approach in which facial landmarks remain invariant across pose over a range of approximately 60 degrees. From a single 2D image of a person's face, a dense 3D shape is registered in real time for each frame. The algorithm uses a fast cascade regression framework trained on high-resolution 3D face scans of posed and spontaneous emotion expression. It first estimates the locations of a dense set of landmarks and their visibility, then reconstructs face shapes by fitting a part-based 3D model. Because no assumptions are required about illumination or surface properties, the method can be applied to a wide range of imaging conditions that include 2D video and uncalibrated multi-view video. The method has been validated in a battery of experiments that evaluate its precision of 3D reconstruction, extension to multi-view reconstruction, temporal integration for videos, and 3D head-pose estimation. Experimental findings strongly support the validity of real-time 3D registration and reconstruction from 2D video. The software is available online at http://zface.org.
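The cascade regression idea named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The feature function `extract_features`, the synthetic "observations", and the ridge-regularized linear stages are all assumptions made for illustration. The sketch only shows the general pattern of refining a shape estimate through a sequence of learned regressors, each trained on the residual left by the previous stage.

```python
import numpy as np

def extract_features(obs, shapes):
    # Hypothetical stand-in for shape-indexed image features: a saturating
    # function of the offset between the observation and the current estimate.
    return np.tanh(obs - shapes)

def add_bias(phi):
    # Append a constant column so each stage can learn an offset term.
    return np.hstack([phi, np.ones((phi.shape[0], 1))])

def train_cascade(obs, true_shapes, init_shapes, n_stages=5, ridge=1e-3):
    # Each stage is a ridge regression from features to the remaining residual.
    stages = []
    shapes = init_shapes.copy()
    for _ in range(n_stages):
        phi = add_bias(extract_features(obs, shapes))
        residual = true_shapes - shapes
        gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
        weights = np.linalg.solve(gram, phi.T @ residual)
        stages.append(weights)
        shapes = shapes + phi @ weights  # additive update of the estimate
    return stages

def apply_cascade(stages, obs, init_shapes):
    # Run the learned stages in order, re-extracting features at every step.
    shapes = init_shapes.copy()
    for weights in stages:
        phi = add_bias(extract_features(obs, shapes))
        shapes = shapes + phi @ weights
    return shapes

# Synthetic data (assumed): 500 samples, 6 shape coordinates each.
rng = np.random.default_rng(0)
true_shapes = rng.normal(0.0, 2.0, size=(500, 6))
obs = true_shapes + rng.normal(0.0, 0.05, size=true_shapes.shape)
init = np.zeros_like(true_shapes)  # start every sample from a "mean shape"

stages = train_cascade(obs, true_shapes, init)
refined = apply_cascade(stages, obs, init)

err_before = np.mean((true_shapes - init) ** 2)
err_after = np.mean((true_shapes - refined) ** 2)
print("mean squared error before:", err_before, "after:", err_after)
```

Because the toy features saturate, a single linear stage cannot recover the full offset, which is why several stages are chained. In a real system the features would be computed from image patches around the current landmark estimates, and a 3D model fit would follow.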

Takeo Kanade | Jeffrey F. Cohn | László A. Jeni
