Discriminative Shared Gaussian Processes for Multiview and View-Invariant Facial Expression Recognition

Images of facial expressions are often captured from various views as a result of either head movements or variable camera position. Existing methods for multiview and/or view-invariant facial expression recognition typically classify the observed expression using either classifiers learned separately for each view or a single classifier learned for all views. However, these approaches ignore the fact that different views of a facial expression are just different manifestations of the same facial expression. By accounting for this redundancy, we can design more effective classifiers for the target task. To this end, we propose a discriminative shared Gaussian process latent variable model (DS-GPLVM) for multiview and view-invariant classification of facial expressions. In this model, we first learn a discriminative manifold shared by multiple views of a facial expression. Facial expression classification is then performed in this expression manifold, either in a view-invariant manner (using only a single view of the expression) or in a multiview manner (using multiple views of the expression). The proposed model can also be used to perform fusion of different facial features in a principled manner. We validate the proposed DS-GPLVM on both posed and spontaneously displayed facial expressions from three publicly available datasets (MultiPIE, Labeled Face Parts in the Wild, and Static Facial Expressions in the Wild). We show that this model outperforms the state-of-the-art methods for multiview and view-invariant facial expression classification, and several state-of-the-art methods for multiview learning and feature fusion.
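
The three steps named in the abstract (learn a space shared by all views, map an observed view into it, classify there) can be illustrated with a toy sketch. This is not the DS-GPLVM: as a loudly stated assumption, the Gaussian process model is replaced by a linear stand-in (PCA on the concatenated views for the shared space, ridge regression for the per-view mappings, and a k-nearest-neighbour classifier), and the data are synthetic. All function names and parameters below are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_views(n_per_class=40, n_classes=3, view_dims=(10, 12), seed=0):
    """Synthetic data: one 2-D latent point per sample, observed through two linear 'views'."""
    rng = np.random.default_rng(seed)
    angles = 2 * np.pi * np.arange(n_classes) / n_classes
    centers = 6.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    labels = np.repeat(np.arange(n_classes), n_per_class)
    z = centers[labels] + rng.normal(size=(labels.size, 2))
    views = []
    for d in view_dims:
        mixing = rng.normal(size=(2, d))  # view-specific linear rendering of the latent point
        views.append(z @ mixing + 0.1 * rng.normal(size=(labels.size, d)))
    return views, labels


def fit_shared_space(views, latent_dim=2, ridge=1e-3):
    """Linear stand-in for a shared manifold: PCA over all views plus one ridge map per view."""
    means = [v.mean(axis=0) for v in views]
    centered = [v - m for v, m in zip(views, means)]
    stacked = np.hstack(centered)
    u, s, _ = np.linalg.svd(stacked, full_matrices=False)
    latent = u[:, :latent_dim] * s[:latent_dim]  # shared coordinates of the training samples
    maps = []
    for c in centered:
        gram = c.T @ c + ridge * np.eye(c.shape[1])
        maps.append(np.linalg.solve(gram, c.T @ latent))  # ridge map: view -> shared space
    return latent, means, maps


def project(x, view_idx, means, maps):
    """Map samples observed in a single view into the shared space."""
    return (x - means[view_idx]) @ maps[view_idx]


def knn_predict(train_latent, train_labels, query_latent, k=5):
    """Majority vote among the k nearest training points in the shared space."""
    dists = ((query_latent[:, None, :] - train_latent[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in train_labels[nearest]])


def demo(seed=0):
    """Return (single-view accuracy, fused two-view accuracy) on held-out samples."""
    views, labels = make_views(seed=seed)
    order = np.random.default_rng(seed + 1).permutation(labels.size)
    train, test = order[:90], order[90:]
    latent, means, maps = fit_shared_space([v[train] for v in views])
    # View-invariant use: only view 0 of each test sample is available.
    single = project(views[0][test], 0, means, maps)
    # Multiview use: average the projections obtained from every available view.
    fused = np.mean(
        [project(v[test], i, means, maps) for i, v in enumerate(views)], axis=0
    )
    acc_single = float(np.mean(knn_predict(latent, labels[train], single) == labels[test]))
    acc_fused = float(np.mean(knn_predict(latent, labels[train], fused) == labels[test]))
    return acc_single, acc_fused
```

Calling `demo()` returns the held-out accuracy for the single-view and the fused two-view case. Averaging the per-view projections is only one simple way to combine views in this toy setting; it says nothing about how the paper's model performs fusion or how it would behave on real facial images.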
