Emotion recognition from multi-modal information

Emotion recognition is the ability to detect what people are feeling from moment to moment and to understand the connection between their feelings and their verbal and non-verbal expressions. When you are aware of your emotions, you can think clearly and creatively, manage stress and challenges, communicate well with others, and display trust, empathy, and confidence. In today's world, the human-computer interaction (HCI) interface undoubtedly plays an important role in daily life. Toward a harmonious HCI interface, automated analysis of human emotion has attracted increasing attention from researchers across multiple disciplines. In this paper, we present a survey of theoretical and practical work offering new and broad views of the latest research in emotion recognition from multi-modal information, including facial and vocal expressions. Theoretical background and applications are surveyed for these modalities, ranging from salient emotional features and emotional-cognitive models to multi-modal data fusion strategies. The conclusions outline some of the remaining challenges in emotion recognition.
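To make the notion of multi-modal data fusion concrete, the following minimal Python sketch illustrates one common family of approaches, decision-level fusion, in which separately trained acoustic and facial classifiers each produce class posteriors that are then combined. The label set, classifier outputs, and fixed fusion weight are illustrative assumptions, not a method taken from any specific surveyed paper.

```python
# Minimal sketch of decision-level audio-visual fusion for emotion recognition.
# The emotion label set, posterior vectors, and fusion weight are hypothetical
# placeholders used only to illustrate the idea of combining modality-specific
# classifier decisions.
import numpy as np

EMOTIONS = ["anger", "happiness", "sadness", "neutral"]  # example label set

def fuse_decisions(audio_probs: np.ndarray,
                   face_probs: np.ndarray,
                   audio_weight: float = 0.5) -> str:
    """Combine per-modality class posteriors with a weighted sum and
    return the emotion label with the highest fused score."""
    fused = audio_weight * audio_probs + (1.0 - audio_weight) * face_probs
    return EMOTIONS[int(np.argmax(fused))]

# Example: posteriors produced by separate acoustic and facial-expression models.
audio_probs = np.array([0.10, 0.60, 0.20, 0.10])  # from a speech classifier
face_probs = np.array([0.05, 0.70, 0.15, 0.10])   # from a facial classifier
print(fuse_decisions(audio_probs, face_probs))    # -> "happiness"
```

Feature-level and model-level fusion (e.g., fused HMM variants) differ in that the modalities are combined before or during classification rather than at the decision stage, typically at the cost of needing synchronized multi-modal training data.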
