Biologically Inspired Facial Emotion Recognition

When facial emotion recognition is performed in unconstrained settings, humans outperform state-of-the-art algorithms. State-of-the-art systems suffer from three major technical problems: (1) they attempt to use every frame in the training data to build a model, including frames that are redundant or unnecessary for describing a person's emotions; (2) when they use a Gabor filter as a facial feature descriptor, they capture noise from background texture as though it were important edge information, and the amount of memory required to describe faces with the Gabor filter is undesirably high; (3) most current algorithms do not generalize to unconstrained data, because each person expresses emotions differently and the people in the testing data are not the same people encountered in the training data, so models built from the training data cannot properly predict the emotions of unseen testing samples. We address each of these three problems with systems based on the human visual system. The first system, vision and attention theory, temporally downsamples the training and testing data to reduce the memory cost. The second system, background-suppressing Gabor filtering, represents the face the way the human visual system's non-classical receptive field does, in order to overcome background texture. The third system, score-based facial emotion recognition, scores a frontal face image's relationship to facial and temporal references. We thoroughly test all three systems on four publicly available datasets: the Japanese Female Facial Expression (JAFFE) database, Cohn-Kanade+, MMI (Man-Machine Interaction), and the Audio/Visual Emotion Challenge (AVEC). We find that our systems, which emulate the human visual system, perform better than state-of-the-art systems. This work shows promise for the detection of facial emotion in unconstrained settings.
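The first system's temporal downsampling can be illustrated with a minimal sketch. The uniform stride shown here is an assumption for illustration; the vision-and-attention-theory selection described above is more elaborate, but the memory-reduction idea is the same: drop redundant frames so the training model stays small.

```python
def downsample_frames(frames, stride):
    """Keep every `stride`-th frame of a video sequence.

    A crude stand-in for attention-driven frame selection: the goal is
    simply to discard redundant frames so that the training model (and
    its memory footprint) does not grow with every frame of the clip.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return frames[::stride]


# Example: a 100-frame clip reduced to 25 frames.
kept = downsample_frames(list(range(100)), stride=4)
```

A stride of 4 keeps one quarter of the frames; in practice the stride would be tuned so that the retained frames still span the onset, apex, and offset of the expression.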
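The second system's background suppression can be sketched in NumPy as surround suppression of Gabor energy, in the spirit of non-classical receptive field inhibition: texture excites the annular surround and is subtracted from the central response, while isolated facial edges survive. The filter sizes, the difference-of-Gaussians surround weight, and the suppression strength `alpha` below are illustrative assumptions, not the exact formulation of the thesis.

```python
import numpy as np


def gabor_pair(sigma, theta, lam, size=15):
    """Even (cosine) and odd (sine) Gabor kernels at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    gauss = np.exp(-(xr**2 + yr**2) / (2.0 * sigma**2))
    return gauss * np.cos(2 * np.pi * xr / lam), gauss * np.sin(2 * np.pi * xr / lam)


def conv_same(img, k):
    """Naive 'same'-size 2-D correlation with zero padding (for clarity)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out


def inhibited_gabor_energy(img, theta, sigma=2.0, lam=5.0, alpha=1.0):
    """Gabor energy with non-classical-receptive-field-style inhibition.

    Energy is the modulus of the even/odd Gabor responses; the suppression
    term is the energy averaged over an annular surround (a half-wave
    rectified, L1-normalised difference of Gaussians), so cluttered
    background texture inhibits itself while isolated edges are preserved.
    """
    even_k, odd_k = gabor_pair(sigma, theta, lam)
    energy = np.hypot(conv_same(img, even_k), conv_same(img, odd_k))

    half = 10  # surround extent (assumed)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    dog = (np.exp(-r2 / (2.0 * (4 * sigma)**2)) / (4 * sigma)**2
           - np.exp(-r2 / (2.0 * sigma**2)) / sigma**2)
    w = np.maximum(dog, 0.0)      # keep only the positive annulus
    w /= w.sum()                  # L1 normalisation
    suppression = conv_same(energy, w)
    return np.maximum(energy - alpha * suppression, 0.0)
```

Because the surround weight and the energy are both non-negative, the inhibited response never exceeds the raw Gabor energy; textured background regions, which fill their own surround with energy, are suppressed most strongly.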
