Detection of Negative Emotion Using Acoustic Cues and Machine Learning Algorithms in Moroccan Dialect

The speech signal provides rich information about the speaker's emotional state. Recognition of emotion in speech has therefore become an active research theme in speech processing and in applications based on human-computer interaction. This article presents an experimental study of the detection of negative emotions, namely fear and anger, relative to the neutral emotional state. The data set is collected from speech recorded in the Moroccan Arabic dialect. Our aim is first to study the effect of emotion on selected acoustic features, namely the first four formants (F1, F2, F3, F4), the fundamental frequency (F0), intensity, number of pulses, jitter, and shimmer, and then to compare our results with previous work. We also study the influence of phonemes and speaker gender on the relevance of these features for emotion detection. To this end, we performed classification tests using the WEKA software. We found that F0, intensity, and number of pulses yield the best recognition rates regardless of speaker gender and phoneme. Moreover, the second and third formants are the features that most clearly reflect the effect of the phoneme.
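As a minimal sketch of the kind of classification test described above, the following Java snippet runs a 10-fold cross-validation in WEKA over a feature table containing the acoustic measures listed in the abstract. The ARFF file name, the position of the class attribute, and the choice of the SMO (SVM) classifier are assumptions for illustration; the abstract does not state which WEKA classifier or file layout was used.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class EmotionClassificationSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF file: one row per utterance, with the acoustic
        // features named in the abstract (F0, F1-F4, intensity, number of
        // pulses, jitter, shimmer) and a nominal class label
        // {neutral, fear, anger}.
        DataSource source = new DataSource("moroccan_emotion_features.arff");
        Instances data = source.getDataSet();

        // The emotion label is assumed to be the last attribute.
        data.setClassIndex(data.numAttributes() - 1);

        // SMO (WEKA's SVM) is used purely as an example classifier.
        SMO classifier = new SMO();

        // Standard 10-fold cross-validation as provided by WEKA.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(classifier, data, 10, new Random(1));

        System.out.println(eval.toSummaryString("\n=== Results ===\n", false));
        System.out.println(eval.toMatrixString("=== Confusion matrix ==="));
    }
}
```

The same evaluation can of course be run from the WEKA Explorer GUI; the programmatic form is shown only to make the feature-table-plus-cross-validation setup explicit.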
