Affect burst detection using multi-modal cues

Affect bursts have recently gained importance in the field of emotion recognition, since they can serve as a prior for recognising the speaker's underlying emotional state. In this paper we propose a data-driven approach for detecting affect bursts from multi-modal input streams, namely audio and facial landmark points. The proposed Gaussian mixture model (GMM) based method learns each modality independently and then combines the probabilistic outputs to form a decision. This decision-level fusion gives us an edge over feature-fusion methods, as it lets us handle cases where one of the modalities is too noisy or unavailable. We demonstrate the robustness of the proposed approach on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database, which contains realistic and natural dyadic conversations. Three annotators segmented and labelled the affect bursts in this database for training and testing purposes. We also present a performance comparison between SVM-based and GMM-based methods under the same experimental configuration.
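The abstract describes per-modality modelling followed by decision-level fusion. Below is a minimal sketch of that idea, assuming one GMM per (class, modality) pair built with scikit-learn's GaussianMixture; the label set, feature dimensions, and fusion weights are illustrative assumptions, not the paper's exact configuration:

    # Decision-level fusion of per-modality GMMs (a sketch, not the paper's exact setup).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    CLASSES = ["affect_burst", "speech"]   # hypothetical label set
    MODALITIES = ["audio", "face"]         # e.g. audio features, facial landmark features

    def train_models(train_data, n_components=8):
        """train_data[modality][cls] is an (N, D) feature matrix for that class."""
        models = {}
        for mod in MODALITIES:
            models[mod] = {}
            for cls in CLASSES:
                gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
                gmm.fit(train_data[mod][cls])
                models[mod][cls] = gmm
        return models

    def classify(models, sample, weights=None):
        """sample[modality] is a (T, D) feature matrix, or None if that modality
        is missing or too noisy; log-likelihoods are summed only over the
        modalities that are available, so the decision degrades gracefully."""
        weights = weights or {mod: 1.0 for mod in MODALITIES}
        scores = {cls: 0.0 for cls in CLASSES}
        for mod in MODALITIES:
            feats = sample.get(mod)
            if feats is None:
                continue  # skip an unavailable modality instead of failing
            for cls in CLASSES:
                # score() returns the average per-frame log-likelihood
                scores[cls] += weights[mod] * models[mod][cls].score(feats)
        return max(scores, key=scores.get)

    # Hypothetical usage with random placeholder features:
    rng = np.random.default_rng(0)
    dims = {"audio": 26, "face": 60}
    train = {m: {c: rng.standard_normal((200, dims[m])) for c in CLASSES}
             for m in MODALITIES}
    models = train_models(train)
    test = {"audio": rng.standard_normal((50, 26)), "face": None}  # face unavailable
    print(classify(models, test))

Because each modality contributes an independent likelihood term, a noisy or absent stream simply drops out of the sum, which is the robustness advantage the abstract claims over feature fusion.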

[1] K. Scherer, et al. Affect bursts: dynamic patterns of facial expression, 2011, Emotion.

[2] William Curran, et al. Laughter Type Recognition from Whole Body Motion, 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[3] Qiang Ji, et al. 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, ACII 2013, Geneva, Switzerland, September 2-5, 2013, ACII.

[4] Maja Pantic, et al. Audiovisual Discrimination Between Speech and Laughter: Why and When Visual Information Might Help, 2011, IEEE Transactions on Multimedia.

[5] Maja Pantic, et al. Decision-Level Fusion for Audio-Visual Laughter Detection, 2008, MLMI.

[6] Tanja Schultz, et al. Detection of Laughter-in-Interaction in Multichannel Close-Talk Microphone Recordings of Meetings, 2008, MLMI.

[7] Carlos Busso, et al. IEMOCAP: interactive emotional dyadic motion capture database, 2008, Lang. Resour. Evaluation.

[8] Björn W. Schuller, et al. Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech, 2008, PIT.

[9] Maja Pantic, et al. Audiovisual Detection of Laughter in Human-Machine Interaction, 2013, 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction.

[10] Engin Erzin, et al. Affect burst recognition using multi-modal cues, 2014, 2014 22nd Signal Processing and Communications Applications Conference (SIU).

[11] Marc Schröder, et al. Experimental study of affect bursts, 2003, Speech Commun.

[12] Steve Young, et al. The HTK book, 1995.

[13] B. N. Barman. Laughing, Crying, Sneezing and Yawning: Automatic Voice Driven Animation of Non-Speech Articulations, 2006.

[14] David A. van Leeuwen, et al. Automatic discrimination between laughter and speech, 2007, Speech Commun.