Real time 3D face alignment with Random Forests-based Active Appearance Models

Many desirable applications of automatic face analysis rely on robust facial feature localization. While extensive research has been carried out on standard 2D imagery, recent technological advances have made the acquisition of 3D data both accurate and affordable, opening the way to more accurate and robust algorithms. We present a model-based approach to real-time face alignment that fits a 3D model to depth and intensity images of unseen expressive faces. We use random regression forests to drive the fitting in an Active Appearance Model framework. We thoroughly evaluate the proposed approach on publicly available datasets and show that adding the depth channel improves the robustness and accuracy of the algorithm.
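
To make the idea of regression-driven fitting concrete, the toy sketch below trains a small randomized regression forest to predict a parameter correction from features sampled around the current model estimate, then applies those corrections iteratively. It is only an illustration under strong assumptions and not the implementation described here: the model has a single 1D position parameter, the "intensity" and "depth" channels are synthetic profiles, and every name and setting (`sample_features`, `grow_tree`, `train_forest`, `fit`, the offsets, tree depth, and sample counts) is hypothetical.

```python
import math
import random

# Hypothetical sampling positions, relative to the current model estimate.
OFFSETS = [-3.0, -1.5, 0.0, 1.5, 3.0]


def sample_features(center, position):
    """Sample two synthetic channels around `position` for an object at `center`."""
    feats = []
    for o in OFFSETS:
        x = position + o - center
        feats.append(math.exp(-x * x / 8.0))      # toy "intensity" profile
        feats.append(1.0 / (1.0 + x * x / 9.0))   # toy "depth" profile
    return feats


def mean(values):
    return sum(values) / len(values)


def sse(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow_tree(samples, rng, depth=0, max_depth=5, min_leaf=5, n_tests=10):
    """Grow one randomized regression tree; a leaf stores the mean target."""
    targets = [t for _, t in samples]
    if depth >= max_depth or len(samples) < 2 * min_leaf:
        return mean(targets)
    best = None
    for _ in range(n_tests):
        k = rng.randrange(len(samples[0][0]))   # random feature index
        thr = rng.choice(samples)[0][k]         # random threshold from the data
        left = [s for s in samples if s[0][k] < thr]
        right = [s for s in samples if s[0][k] >= thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        cost = sse([t for _, t in left]) + sse([t for _, t in right])
        if best is None or cost < best[0]:
            best = (cost, k, thr, left, right)
    if best is None:
        return mean(targets)
    _, k, thr, left, right = best
    return (k, thr,
            grow_tree(left, rng, depth + 1, max_depth, min_leaf, n_tests),
            grow_tree(right, rng, depth + 1, max_depth, min_leaf, n_tests))


def predict_tree(node, feats):
    while isinstance(node, tuple):
        k, thr, left, right = node
        node = left if feats[k] < thr else right
    return node


def train_forest(n_trees=10, n_samples=400, max_shift=3.0, seed=0):
    """Learn to map features at a perturbed position to the correction -d."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        center = rng.uniform(-10.0, 10.0)          # known ground truth
        d = rng.uniform(-max_shift, max_shift)     # synthetic perturbation
        samples.append((sample_features(center, center + d), -d))
    forest = []
    for _ in range(n_trees):
        bag = [rng.choice(samples) for _ in samples]   # bootstrap sample
        forest.append(grow_tree(bag, rng))
    return forest


def fit(forest, center, start, n_iters=5):
    """Iteratively refine the estimate with the forest's averaged prediction."""
    p = start
    for _ in range(n_iters):
        feats = sample_features(center, p)
        p += mean([predict_tree(tree, feats) for tree in forest])
    return p


forest = train_forest()
true_center = 4.0
estimate = fit(forest, true_center, start=true_center + 2.5)
print("initial error: 2.5, final error:", abs(estimate - true_center))
```

In this sketch the forest plays the role of the update function in an iterative fitting loop: rather than computing a gradient of an appearance error, each iteration looks up a learned correction. Concatenating the two channels into one feature vector is one simple way to let the trees choose whichever cue is more informative at each split.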
