Modeling Dynamics of Expressive Body Gestures In Dyadic Interactions

Body gestures are an important non-verbal channel for affective communication: they convey human attitudes and emotions as these dynamically unfold during interpersonal interaction. Understanding the dynamics of body gestures associated with emotion expression in human interactions is therefore highly desirable. We present a statistical framework for robustly modeling the dynamics of body gestures in dyadic interactions. The framework is built on high-level semantic gesture patterns and consists of three components. First, we construct a universal background model (UBM) using Gaussian mixture modeling (GMM) to represent subject-independent gesture variability. Second, we describe each gesture sequence as a concatenation of semantic gesture patterns derived from a parallel HMM structure. Third, we probabilistically compare the segments of each gesture sequence extracted in the second step against the UBM obtained in the first step, in order to select the most probable gesture patterns for the sequence. The dynamics of each gesture sequence are represented by a statistical variation profile computed from the selected patterns, and are further described in a well-defined kernel space. This framework is compared with three baseline models and is evaluated in emotion recognition experiments, i.e., recognizing the overall emotional state of a participant in a dyadic interaction from the gesture dynamics. The recognition performance demonstrates the superiority of the proposed framework over the baseline models. Analysis of the relationship between recognition performance and the number of selected segments further indicates that a few locally salient events, rather than the whole gesture sequence, are sufficiently informative for humans to summarize their overall perception of emotion.
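The UBM-based selection step can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes hypothetical 6-D pose-feature vectors, a diagonal-covariance GMM as the UBM, and pattern segments already produced by an upstream segmenter (standing in for the parallel-HMM stage). Segments are scored by their mean log-likelihood under the UBM, and the top-k most probable ones are kept as the salient patterns.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Pooled subject-independent gesture features (hypothetical 6-D descriptors)
background = rng.normal(size=(500, 6))

# Component 1: fit a GMM as the universal background model (UBM)
ubm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
ubm.fit(background)

# Component 2 output (assumed given here): one gesture sequence cut into
# semantic pattern segments, each a (frames x 6) feature array
segments = [rng.normal(size=(20, 6)) for _ in range(8)]

def select_salient(segments, ubm, k=3):
    """Score each segment against the UBM and keep the k most probable ones."""
    # GaussianMixture.score returns the mean per-frame log-likelihood
    scores = [ubm.score(seg) for seg in segments]
    order = np.argsort(scores)[::-1]          # highest likelihood first
    return [segments[i] for i in order[:k]], [scores[i] for i in order[:k]]

selected, scores = select_salient(segments, ubm, k=3)
```

In a full pipeline, a variation profile would then be computed over `selected` and mapped into the kernel space described in the abstract; this sketch only covers the segment-scoring and selection step.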
