Event based emotion recognition for realistic non-acted speech

Estimating emotion from speech is an active area of research; however, most of the literature addresses acted speech rather than natural, day-to-day conversational speech. Identifying emotion in the latter is difficult because the emotion expressed by non-actors is not necessarily prominent. In this paper we validate this hypothesis, based on the observation that human annotators show large inter- and intra-person variation when annotating emotions expressed in realistic speech compared to acted speech. We then propose a method to recognize emotions using knowledge of events in an interactive voice response (IVR) setup. The main contribution of the paper is the use of event-based knowledge to enhance the identification of emotions in real, natural speech.
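The core idea, combining acoustic evidence with knowledge of what just happened in the IVR dialogue, can be illustrated with a minimal sketch. The event names, prior values, and multiplicative fusion rule below are illustrative assumptions, not the paper's actual formulation:

```python
# Hedged sketch: fusing hypothetical IVR event priors with an acoustic
# emotion classifier's posteriors. All event names and numbers are
# illustrative assumptions, not taken from the paper.

# Hypothetical prior probability of each emotion given the IVR event that
# preceded the utterance (e.g. repeated input failure makes anger more likely).
EVENT_PRIORS = {
    "repeated_input_failure": {"anger": 0.5, "neutral": 0.3, "sadness": 0.2},
    "successful_transaction": {"anger": 0.1, "neutral": 0.6, "sadness": 0.3},
}

def fuse(acoustic_posteriors, event):
    """Weight acoustic posteriors by event-conditioned priors, then renormalize."""
    priors = EVENT_PRIORS.get(event, {})
    scores = {emo: p * priors.get(emo, 1.0)
              for emo, p in acoustic_posteriors.items()}
    total = sum(scores.values())
    return {emo: s / total for emo, s in scores.items()}

# A weak acoustic cue toward anger becomes decisive after a frustrating event.
posteriors = {"anger": 0.4, "neutral": 0.35, "sadness": 0.25}
fused = fuse(posteriors, "repeated_input_failure")
print(max(fused, key=fused.get))  # prints "anger"
```

The point of the sketch is that non-acted emotion is often acoustically subtle, so a context-free classifier alone may be indecisive; conditioning on dialogue events shifts the decision toward the contextually plausible emotion.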
