A discriminative parts based model approach for fiducial points free and shape constrained head pose normalisation in the wild

This paper proposes a parts-based method for view-invariant head pose normalisation that remains effective in difficult real-world conditions. Handling pose is a classical problem in facial analysis. Recently, parts-based models have shown promising performance for facial landmark point detection 'in the wild'. Building on the success of these models, the proposed data-driven regression framework computes a constrained, normalised virtual frontal head pose. The response maps of a discriminatively trained part detector are used as texture information. These sparse texture maps are projected from non-frontal to frontal pose using block-wise structured regression. Finally, a facial kinematic shape constraint is imposed by applying a shape model. The advantages of the proposed approach are: (a) there is no explicit dependence on the outputs of a facial parts detector, which avoids error propagation when the detector fails; (b) applying a shape prior to the reconstructed frontal maps yields an anatomically constrained facial shape; and (c) modelling head pose as a mixture-of-parts model allows the framework to work without any prior pose information. Experiments are performed on the Multi-PIE and the 'in the wild' SFEW databases. The results demonstrate the effectiveness of the proposed method.
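The two core steps described above (block-wise projection of response maps to the frontal view, followed by a shape constraint) can be illustrated with a minimal sketch. The abstract does not give the regressor or the shape model at this level of detail, so the sketch makes loud assumptions: plain ridge regression stands in for the structured regression, a PCA point-distribution model with clipped coefficients stands in for the shape model, and all function names and array layouts are hypothetical.

```python
import numpy as np

def fit_blockwise_ridge(X_blocks, Y_blocks, lam=1e-2):
    """Fit one ridge regressor per block (assumed stand-in for structured regression).

    X_blocks: (n_blocks, n_samples, d_in)  non-frontal response-map blocks
    Y_blocks: (n_blocks, n_samples, d_out) matching frontal blocks
    Returns weights of shape (n_blocks, d_in, d_out).
    """
    n_blocks, _, d_in = X_blocks.shape
    W = np.empty((n_blocks, d_in, Y_blocks.shape[2]))
    for b in range(n_blocks):
        X, Y = X_blocks[b], Y_blocks[b]
        # Closed-form ridge solution: (X^T X + lam I)^-1 X^T Y
        W[b] = np.linalg.solve(X.T @ X + lam * np.eye(d_in), X.T @ Y)
    return W

def predict_blockwise(W, x_blocks):
    """Map each non-frontal block (n_blocks, d_in) to its frontal estimate."""
    return np.einsum("bd,bde->be", x_blocks, W)

def peaks_from_maps(maps):
    """Take the strongest response in each map (n_parts, h, w) as a (row, col) point."""
    n_parts, h, w = maps.shape
    flat_idx = maps.reshape(n_parts, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_idx, (h, w))
    return np.stack([rows, cols], axis=1)

def fit_shape_model(shapes, n_modes):
    """PCA shape model from training shapes of shape (n_samples, 2 * n_points)."""
    mean = shapes.mean(axis=0)
    _, S, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    modes = Vt[:n_modes]  # orthonormal rows, one per mode
    std = S[:n_modes] / np.sqrt(max(len(shapes) - 1, 1))
    return mean, modes, std

def constrain_shape(shape, mean, modes, std, k=3.0):
    """Project a shape onto the model and clip each mode to +/- k standard deviations."""
    b = modes @ (shape - mean)
    b = np.clip(b, -k * std, k * std)
    return mean + modes.T @ b
```

In this sketch, points would be read off the predicted frontal maps with `peaks_from_maps`, flattened into a shape vector, and passed through `constrain_shape`; because no landmark detector output is used as input, a detector failure cannot propagate into the regression.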
