Exemplar Hidden Markov Models for classification of facial expressions in videos

Facial expressions are dynamic events composed of meaningful temporal segments. A common approach to facial expression recognition in video is to first convert variable-length expression sequences into a vector representation by computing summary statistics of image-level or spatio-temporal features. These representations are then passed to a discriminative classifier such as a support vector machine (SVM). However, these approaches do not fully exploit the temporal dynamics of facial expressions. Hidden Markov Models (HMMs) provide a method for modeling variable-length expression time series. Although HMMs have been explored in the past for expression classification, they are rarely used because their classification performance is often lower than that of discriminative approaches, which may be attributed to the challenges of estimating generative models. This paper explores an approach for combining the modeling strength of HMMs with the discriminative power of SVMs via a model-based similarity framework. Each example is first instantiated into an Exemplar-HMM model. A probabilistic kernel is then used to compute a kernel matrix over these models, which is used with an SVM classifier. This paper proposes that dynamical models such as HMMs are advantageous for the facial expression problem space when employed in a discriminative, exemplar-based classification framework. The approach yields state-of-the-art results on both posed (CK+ and OULU-CASIA) and spontaneous (FEEDTUM and AM-FED) expression datasets, highlighting its performance advantages.
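The model-based similarity step can be illustrated with a small sketch. The abstract does not say which probabilistic kernel is used, so the code below assumes one common choice, the probability product kernel with exponent 1 (the expected likelihood kernel), which sums p_a(x) * p_b(x) over all observation sequences x of a fixed length T. It also assumes discrete emissions and hand-set parameters in place of HMMs fitted to real expression sequences. All names (`expected_likelihood_kernel`, `hmm_onset`, and so on) are hypothetical and chosen for illustration only.

```python
from itertools import product
from math import sqrt

# An HMM is a tuple (pi, A, B): initial probabilities, transition matrix,
# and discrete emission matrix B[state][symbol]. Discrete emissions are an
# assumption made to keep the sketch short.

def sequence_likelihood(hmm, symbols):
    """Forward algorithm: p(symbols | hmm)."""
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][symbols[0]] for i in range(n)]
    for v in symbols[1:]:
        alpha = [B[j][v] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)

def expected_likelihood_kernel(hmm_a, hmm_b, T):
    """Sum over all length-T sequences x of p_a(x) * p_b(x), computed by
    dynamic programming over pairs of hidden states."""
    pi_a, A_a, B_a = hmm_a
    pi_b, A_b, B_b = hmm_b
    na, nb, n_sym = len(pi_a), len(pi_b), len(B_a[0])
    # psi[i][j]: overlap of the emission distributions of state i and state j
    psi = [[sum(B_a[i][v] * B_b[j][v] for v in range(n_sym))
            for j in range(nb)] for i in range(na)]
    # phi[i][j]: accumulated mass of joint paths ending in state pair (i, j)
    phi = [[pi_a[i] * pi_b[j] * psi[i][j] for j in range(nb)]
           for i in range(na)]
    for _ in range(T - 1):
        phi = [[psi[i][j] * sum(phi[k][l] * A_a[k][i] * A_b[l][j]
                                for k in range(na) for l in range(nb))
                for j in range(nb)] for i in range(na)]
    return sum(sum(row) for row in phi)

def brute_force_kernel(hmm_a, hmm_b, T):
    """Same quantity by explicit enumeration; only feasible for tiny T."""
    n_sym = len(hmm_a[2][0])
    return sum(sequence_likelihood(hmm_a, x) * sequence_likelihood(hmm_b, x)
               for x in product(range(n_sym), repeat=T))

def normalized_kernel_matrix(models, T):
    """Gram matrix with entries K_ij / sqrt(K_ii * K_jj)."""
    raw = [[expected_likelihood_kernel(a, b, T) for b in models]
           for a in models]
    return [[raw[i][j] / sqrt(raw[i][i] * raw[j][j])
             for j in range(len(models))] for i in range(len(models))]

# Two hand-set exemplar HMMs standing in for models fitted to two sequences.
hmm_onset = ([0.9, 0.1],
             [[0.7, 0.3], [0.1, 0.9]],
             [[0.8, 0.2], [0.3, 0.7]])
hmm_neutral = ([0.5, 0.5],
               [[0.5, 0.5], [0.5, 0.5]],
               [[0.5, 0.5], [0.5, 0.5]])

models = [hmm_onset, hmm_neutral]
gram = normalized_kernel_matrix(models, T=4)
for row in gram:
    print(["%.4f" % value for value in row])
```

A matrix of this form, computed over one exemplar model per training sequence, is the kind of input a kernel SVM accepts; for instance, scikit-learn's `SVC(kernel="precomputed")` takes such a Gram matrix directly. The brute-force function is included only as a cross-check of the dynamic program on short sequences.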
