Emotion Sensing From Head Motion Capture

Computational analysis of emotion from verbal and non-verbal behavioral cues is critical for human-centric intelligent systems. Among non-verbal cues, head motion has received relatively little attention, although its importance has been noted in several studies. We propose a new approach for emotion recognition from head motion captured via Motion Capture (MoCap). Our approach is motivated by the well-known kinesics-phonetic analogy, which holds that, just as human speech is composed of phonemes, head motion is composed of kinemes, i.e., elementary motion units. We discover a set of kinemes from head motion in an unsupervised manner by projecting the motion onto a learned basis domain and subsequently clustering the projections; this transforms any head motion into a sequence of kinemes. Next, we learn the temporal latent structure of the kineme sequences pertaining to each emotion. For this purpose, we explore two separate approaches: one using Hidden Markov Models and another using a neural network. This class-specific, kineme-based representation of head motion is used to perform emotion recognition on the popular IEMOCAP database. We achieve high recognition accuracy (61.8% for the three-class task) on various emotion recognition tasks using head motion alone. This work adds to our understanding of head motion dynamics and has applications in emotion analysis and in the animation and synthesis of head motion.
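The sketch below illustrates the pipeline described above, purely as an aid to reading. The abstract does not specify the basis-learning, clustering, or sequence-modeling methods, so this sketch assumes a PCA basis over fixed-length head-pose windows, k-means for kineme discovery, and a first-order Markov chain per emotion class as a simplified stand-in for the class-specific HMMs; all function names (discover_kinemes, motion_to_kinemes, fit_markov_chain, classify) are hypothetical.

# Illustrative sketch only; the concrete basis, clusterer, and sequence
# models are assumptions, not the method confirmed by the abstract.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans


def discover_kinemes(windows, n_basis=8, n_kinemes=16, seed=0):
    """Learn a basis over head-motion windows, project, and cluster.

    windows: (N, D) array; each row is a flattened fixed-length window of
             head pose angles (e.g. pitch/yaw/roll over T frames).
    Returns the fitted basis model, the clusterer, and kineme labels.
    """
    basis = PCA(n_components=n_basis).fit(windows)
    coeffs = basis.transform(windows)  # projection onto the learned basis
    clusterer = KMeans(n_clusters=n_kinemes, n_init=10, random_state=seed).fit(coeffs)
    return basis, clusterer, clusterer.labels_


def motion_to_kinemes(windows, basis, clusterer):
    """Map a sequence of head-motion windows to a kineme label sequence."""
    return clusterer.predict(basis.transform(windows))


def fit_markov_chain(sequences, n_kinemes, alpha=1.0):
    """Fit initial and transition log-probabilities with add-alpha smoothing."""
    init = np.full(n_kinemes, alpha)
    trans = np.full((n_kinemes, n_kinemes), alpha)
    for seq in sequences:
        init[seq[0]] += 1
        for a, b in zip(seq[:-1], seq[1:]):
            trans[a, b] += 1
    return np.log(init / init.sum()), np.log(trans / trans.sum(axis=1, keepdims=True))


def sequence_log_likelihood(seq, log_init, log_trans):
    """Score a kineme sequence under one class-specific chain."""
    ll = log_init[seq[0]]
    for a, b in zip(seq[:-1], seq[1:]):
        ll += log_trans[a, b]
    return ll


def classify(seq, class_models):
    """Pick the emotion whose kineme-sequence model scores the sequence highest."""
    return max(class_models, key=lambda c: sequence_log_likelihood(seq, *class_models[c]))

At test time, an unseen head-motion segment would be windowed, mapped to a kineme sequence with motion_to_kinemes, and assigned to the emotion whose class-specific sequence model yields the highest log-likelihood.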
