Vision and Attention Theory Based Sampling for Continuous Facial Emotion Recognition

Affective computing, the emerging field in which computers detect emotions and project appropriate expressions of their own, has reached a bottleneck: algorithms cannot reliably infer a person's emotions from natural, spontaneous facial expressions captured in video. While emotion recognition has advanced considerably over the past decade, no facial emotion recognition approach has yet been shown to perform well in unconstrained settings. In this paper, we propose a principled method that addresses the temporal dynamics of facial emotions and expressions in video with a sampling approach inspired by human perceptual psychology. We evaluate the method on the Audio/Visual Emotion Challenge (AVEC) 2011 and 2012, the Cohn-Kanade dataset, and the MMI Facial Expression Database. The method shows an average improvement of 9.8 percent over the baseline in weighted accuracy on the AVEC 2011 video-based frame-level subchallenge test set.
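
The core idea stated above is to sample video frames according to attention-theoretic cues rather than processing every frame uniformly. The abstract does not spell out the sampler itself, so the following is only a minimal illustrative sketch, assuming a motion-energy saliency cue and a fixed frame budget; the function names and parameters (motion_energy, attention_sample, uniform_weight) are hypothetical and are not the authors' implementation.

# Illustrative sketch (not the paper's method): attention-driven temporal
# sampling of video frames. Frames with higher motion energy, a rough proxy
# for the dynamic facial events that human attention favors, are sampled
# more densely than static frames.

import numpy as np

def motion_energy(frames):
    # frames: array of shape (T, H, W), grayscale, values in [0, 1].
    # Returns a length-T array of mean absolute frame differences;
    # the first frame is assigned zero energy.
    diffs = np.abs(np.diff(frames, axis=0))              # (T-1, H, W)
    energy = diffs.reshape(diffs.shape[0], -1).mean(axis=1)
    return np.concatenate(([0.0], energy))

def attention_sample(frames, budget, uniform_weight=0.2, rng=None):
    # Draw `budget` distinct frame indices with probability proportional to
    # motion energy, mixed with a uniform component so static frames are
    # never entirely excluded.
    rng = np.random.default_rng() if rng is None else rng
    t = len(frames)
    e = motion_energy(frames)
    if e.sum() > 0:
        p = (1.0 - uniform_weight) * (e / e.sum()) + uniform_weight / t
    else:
        p = np.full(t, 1.0 / t)
    idx = rng.choice(t, size=min(budget, t), replace=False, p=p)
    return np.sort(idx)

if __name__ == "__main__":
    # Toy example: 100 synthetic frames with "facial motion" only in frames 40-59.
    rng = np.random.default_rng(0)
    frames = np.zeros((100, 32, 32))
    frames[40:60] = rng.random((20, 32, 32))
    print(attention_sample(frames, budget=10, rng=rng))

In this toy setup the dynamic segment (frames 40-59) dominates the sampled indices, which mirrors the intuition that perceptually salient, rapidly changing portions of an expression deserve denser temporal sampling than neutral, static stretches.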
