Spontaneous facial expression recognition: A robust metric learning approach

Spontaneous facial expression recognition is significantly more challenging than recognizing posed expressions. We focus on two issues that remain under-addressed in this area. First, because spontaneous expressions are inherently subtle, their geometric and appearance features tend to overlap across classes, making it hard for classifiers to find effective separation boundaries. Second, the training set usually contains dubious class labels, which can hurt recognition performance if no countermeasure is taken. In this paper, we propose a spontaneous expression recognition method based on robust metric learning that aims to alleviate these two problems. In particular, to increase the discrimination between different facial expressions, we learn a new metric space in which spatially close data points have a higher probability of belonging to the same class. In addition, instead of using the noisy labels directly for metric learning, we define the sensitivity and specificity of each annotator to characterize that annotator's reliability. The distance metric and the annotators' reliabilities are then jointly estimated by maximizing the likelihood of the observed class labels. By introducing latent variables that represent the true class labels, the distance metric and the annotators' reliabilities can be solved iteratively under the Expectation Maximization framework. Comparative experiments show that our method achieves better recognition accuracy on spontaneous expressions, and that the learned metric can be reliably transferred to recognize posed expressions.
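The annotator-reliability part of this idea can be illustrated with a minimal sketch. It is not the paper's method: it assumes binary labels, and it replaces the learned distance metric with a single constant class prior, so only the Expectation Maximization loop over sensitivity, specificity, and the latent true labels is shown. The function name `em_annotator_reliability`, its parameters, and the add-one smoothing are assumptions made for this example.

```python
def em_annotator_reliability(labels, n_iter=50):
    """Estimate latent true labels and per-annotator reliability with EM.

    labels[i][j] is the 0/1 label that annotator j gave to sample i.
    Simplifying assumption: the prior P(true label = 1) is one constant,
    not a probability derived from a learned metric.
    """
    n = len(labels)
    m = len(labels[0])
    # Initialise the posterior of each true label with the vote fraction.
    mu = [sum(row) / m for row in labels]
    sens = [0.5] * m
    spec = [0.5] * m
    for _ in range(n_iter):
        # M-step: re-estimate prior, sensitivity, and specificity from the
        # soft labels. Add-one smoothing keeps every estimate inside (0, 1).
        pos = sum(mu)
        neg = n - pos
        prior = (pos + 1.0) / (n + 2.0)
        sens = [
            (sum(mu[i] * labels[i][j] for i in range(n)) + 1.0) / (pos + 2.0)
            for j in range(m)
        ]
        spec = [
            (sum((1.0 - mu[i]) * (1 - labels[i][j]) for i in range(n)) + 1.0)
            / (neg + 2.0)
            for j in range(m)
        ]
        # E-step: posterior that the true label is 1, given all annotations.
        for i in range(n):
            a = prior          # joint probability with true label = 1
            b = 1.0 - prior    # joint probability with true label = 0
            for j in range(m):
                if labels[i][j] == 1:
                    a *= sens[j]
                    b *= 1.0 - spec[j]
                else:
                    a *= 1.0 - sens[j]
                    b *= spec[j]
            mu[i] = a / (a + b)
    return mu, sens, spec


# Toy data: annotators 0 and 1 agree on every sample, annotator 2 is noisy.
toy_labels = [
    [1, 1, 1], [1, 1, 0], [1, 1, 1], [1, 1, 0],
    [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 1],
]
posterior, sensitivity, specificity = em_annotator_reliability(toy_labels)
```

On this toy data the two consistent annotators end up with higher estimated sensitivity and specificity than the noisy one, and the posterior follows their labels. In the approach described above, the constant prior would instead depend on the sample's neighbours in the learned metric space, so that the metric and the reliabilities are updated together.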
