Anger Detection in Arabic Speech Dialogs

Anger is arguably the most important emotion to detect in human-human dialogs, such as those in call centers and similar settings. Detected from a speaker's voice, it serves as a direct indicator of the speaker's level of satisfaction. Research on anger detection has recently led to many software applications. In this paper, we design a framework to detect anger in spontaneous Arabic conversations. For our experiments, we construct a well-annotated corpus of anger and neutral emotion states from real-world Arabic speech dialogs. Classification is based on acoustic features that are well suited to anger detection. We explore many such features, including the fundamental frequency, formants, energy, and Mel-frequency cepstral coefficients (MFCCs). Several classifiers are evaluated, and the experimental results show that support vector machine classifiers can achieve a real-time anger detection rate of more than 77%.
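
To make two of the acoustic features named above concrete, the following is a minimal sketch of how fundamental frequency and energy could be computed for a single speech frame. It is an illustration under stated assumptions, not the extraction procedure used in the paper: the autocorrelation pitch estimator, the function names `rms_energy` and `estimate_f0`, the 75-500 Hz search range, and the synthetic 200 Hz test tone are all hypothetical choices made here.

```python
import math

def rms_energy(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 in Hz from the autocorrelation peak.

    The search range (f0_min, f0_max) is an assumed range for
    speech pitch, not a value taken from the paper.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = min(int(sample_rate / f0_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # Return 0.0 when no positive autocorrelation peak was found.
    return sample_rate / best_lag if best_lag else 0.0

# Synthetic stand-in for a voiced frame: 200 Hz tone, 8 kHz sampling, 40 ms.
sample_rate = 8000
frame = [0.5 * math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(320)]

# A per-frame feature pair that a classifier could consume.
features = (estimate_f0(frame, sample_rate), rms_energy(frame))
```

In a full pipeline, such per-frame values would typically be summarized over an utterance (for example by mean and range) and combined with formant and MFCC features before being passed to a classifier such as a support vector machine.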
