Non-linear predictors for facial feature tracking across pose and expression

This paper proposes a non-linear predictor for estimating the displacement of tracked feature points on faces that exhibit significant variations in pose and expression. Existing methods such as linear predictors, Active Shape Models (ASMs) and Active Appearance Models (AAMs) are limited to a narrow range of poses. To track across a large pose range, they require separate pose-specific models that are then coupled via a pose estimator. Our approach requires neither a set of pose-specific models nor a pose estimator. Using just a single tracking model, we are able to robustly and accurately track across a wide range of expressions and poses. This is achieved by gradient boosting of regression trees to predict the displacement vectors of tracked points. Additionally, we propose a novel algorithm for simultaneously configuring this hierarchical set of trackers for optimal tracking results. Experiments were carried out on sequences of naturalistic conversation and on sequences with large pose and expression changes. The results show that the proposed method is superior to state-of-the-art methods in its ability to robustly track a set of facial points whilst gracefully recovering from tracking failures.
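
The idea of gradient boosting regression trees to predict a displacement vector can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the features are synthetic stand-ins for image measurements sampled around a tracked point, the trees are depth-1 stumps, the loss is squared error, and one boosted model is fitted per displacement coordinate. All function and variable names are hypothetical.

```python
import numpy as np


def fit_stump(X, r):
    """Fit a depth-1 regression tree to residuals r (least squares)."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        # Drop the largest value so the right branch is never empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best


def predict_stump(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)


def fit_boosted(X, y, n_rounds=40, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the residual."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        residual = y - pred  # negative gradient of 0.5 * (y - pred)^2
        stump = fit_stump(X, residual)
        stumps.append(stump)
        pred += lr * predict_stump(stump, X)
    return base, lr, stumps


def predict_boosted(model, X):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, X) for s in stumps)


# Synthetic data (assumption): 4 features per sample and a non-linear
# mapping to a 2-D displacement (dx, dy).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(150, 4))
Y = np.column_stack([2.0 * np.abs(X[:, 0]), np.sin(3.0 * X[:, 1])])

# One boosted model per displacement coordinate.
models = [fit_boosted(X, Y[:, k]) for k in range(Y.shape[1])]
pred = np.column_stack([predict_boosted(m, X) for m in models])

mse_boost = np.mean((Y - pred) ** 2)
mse_mean = np.mean((Y - Y.mean(axis=0)) ** 2)  # constant-prediction baseline
```

Comparing `mse_boost` with `mse_mean` shows how much the boosted stumps improve on always predicting the mean displacement for the training data. A real tracker would use image-derived features and evaluate on held-out frames.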
