Multimodal mixed emotion detection

This paper presents a method for automatically detecting emotional duality and mixed emotional experiences from continuous multimodal audio-visual data. Coordinates, distances, and movements of tracked points were used to create visual features capturing facial expressions, head movement, hand gestures, and body movement, while spectral and prosodic features were extracted from the audio channel. Audio-visual data, along with depth information, was recorded using an infrared sensor (Kinect). The OpenEAR toolkit and the Face API were used to compute the features. A combined feature vector was created by feature-level fusion, and a support vector machine (SVM) classifier was used for emotion detection. Six participants performing 15 actions were recorded to capture simultaneous mixed emotional experiences. The results showed that concurrent emotions can be detected automatically using multiple modalities: the overall accuracy of multimodal mixed emotion recognition was 96.6%, and the accuracies from facial expressions (92.4%) and head movement (94.3%) were higher than those obtained from hand gestures (77.5%) or body movement (65.2%) alone.
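
The pipeline described above concatenates the per-modality feature vectors into a single vector (feature-level fusion) and trains an SVM on the result. The sketch below illustrates that idea only; it assumes scikit-learn and NumPy, and all feature dimensions, sample counts, and labels are placeholders rather than values from the study.

```python
# Minimal sketch of feature-level fusion with an SVM classifier.
# Assumes scikit-learn and NumPy; feature dimensions and labels are illustrative.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples = 90                                   # placeholder, e.g. 6 participants x 15 actions

# Hypothetical per-sample feature vectors from each modality.
audio_feats = rng.normal(size=(n_samples, 40))   # spectral + prosodic features
face_feats  = rng.normal(size=(n_samples, 30))   # facial-point coordinates/distances
head_feats  = rng.normal(size=(n_samples, 10))   # head movement features
hand_feats  = rng.normal(size=(n_samples, 10))   # hand gesture features
body_feats  = rng.normal(size=(n_samples, 10))   # body movement features

# Feature-level fusion: concatenate modality features into one vector per sample.
X = np.hstack([audio_feats, face_feats, head_feats, hand_feats, body_feats])
y = rng.integers(0, 2, size=n_samples)           # placeholder mixed-emotion labels

# SVM classifier trained on the fused feature vector.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print("Cross-validated accuracy:", scores.mean())
```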
