A 3D Approach to Facial Landmarks: Detection, Refinement, and Tracking

A real-time algorithm for accurate localization of facial landmarks in a single monocular image is proposed. The algorithm is formulated as an optimization problem in which the sum of responses of local classifiers is maximized with respect to the camera pose by fitting a generic (not person-specific) 3D model. The algorithm simultaneously estimates the head position and orientation and detects the facial landmarks in the image. Although the optimization is local, we show that its basin of attraction is large enough for it to be initialized by a scanning-window face detector. Further experiments on standard datasets demonstrate that the proposed algorithm outperforms a state-of-the-art landmark detector, especially for non-frontal face images, and that it is capable of reliable and stable tracking over a large range of viewing angles.
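
The formulation above can be illustrated with a minimal sketch. It is not the authors' implementation: the six-parameter pose, the pinhole camera constants, the six-point 3D model, the Gaussian stand-in for the local classifier responses, and the greedy coordinate-ascent optimizer are all assumptions chosen only to show the structure of the objective (a sum of per-landmark responses evaluated at the projections of a generic 3D model, maximized over the pose).

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Rotation from Euler angles in radians, composed as Rz(roll) @ Ry(yaw) @ Rx(pitch)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    Rz = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def project(model_pts, pose, focal=500.0, center=(160.0, 120.0)):
    """Pinhole projection of Nx3 model points; pose = (yaw, pitch, roll, tx, ty, tz).
    The focal length and principal point are assumed values."""
    pose = np.asarray(pose, dtype=float)
    cam = model_pts @ rotation_matrix(*pose[:3]).T + pose[3:6]
    return focal * cam[:, :2] / cam[:, 2:3] + np.asarray(center)

def total_response(pose, model_pts, peaks, sigma=8.0):
    """Objective: sum over landmarks of a response evaluated at the projected point.
    A Gaussian bump around each peak stands in for a real local classifier."""
    d2 = np.sum((project(model_pts, pose) - peaks) ** 2, axis=1)
    return float(np.sum(np.exp(-d2 / (2.0 * sigma ** 2))))

def fit_pose(init_pose, model_pts, peaks, steps, n_iter=200):
    """Greedy coordinate ascent (a simple local optimizer, assumed here):
    accept a step only if it raises the objective, halve the steps when stuck."""
    pose = np.asarray(init_pose, dtype=float).copy()
    steps = np.asarray(steps, dtype=float).copy()
    best = total_response(pose, model_pts, peaks)
    for _ in range(n_iter):
        improved = False
        for k in range(6):
            for sign in (1.0, -1.0):
                cand = pose.copy()
                cand[k] += sign * steps[k]
                score = total_response(cand, model_pts, peaks)
                if score > best:
                    pose, best, improved = cand, score, True
        if not improved:
            steps *= 0.5
    return pose, best

# Hypothetical generic 3D model (arbitrary units): eye corners, nose tip, mouth corners.
model = np.array([[-45.0, -30.0, 0.0], [45.0, -30.0, 0.0], [-15.0, -30.0, 5.0],
                  [15.0, -30.0, 5.0], [0.0, 0.0, -30.0], [0.0, 35.0, -5.0]])

# Synthetic "classifier peaks": projections of the model under a known pose.
true_pose = np.array([0.3, -0.1, 0.05, 10.0, -5.0, 600.0])
peaks = project(model, true_pose)

# Frontal initialization, standing in for what a face detector would provide.
init_pose = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 650.0])
init_score = total_response(init_pose, model, peaks)
fitted_pose, fitted_score = fit_pose(init_pose, model, peaks,
                                     steps=[0.05, 0.05, 0.05, 5.0, 5.0, 5.0])
landmarks_2d = project(model, fitted_pose)  # detected landmark positions in the image
```

In this sketch the landmark positions are a by-product of the pose: once `fit_pose` returns, `project(model, fitted_pose)` gives the 2D landmarks, which mirrors the joint pose-and-landmark estimation described in the abstract. Because the search is local, the result depends on the initialization.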
