CHARACTERIZATION OF EMOTIONS USING THE DYNAMICS OF PROSODIC FEATURES

This paper explores the dynamics of prosodic parameters for recognizing emotions from speech. Here, dynamics refers to the local, fine-grained variations of prosodic parameters over time. The proposed dynamic prosodic features are represented by: (1) the sequence of syllable durations in the utterance (duration contour), (2) the sequence of fundamental frequency values (pitch contour) and (3) the sequence of frame energy values (energy contour). The Indian Institute of Technology Kharagpur Simulated Emotion Speech Corpus (IITKGP-SESC) is used to analyze the proposed prosodic features for emotion recognition [1]. The emotions considered in this work are anger, disgust, fear, happiness, neutral and sadness. Support vector machines (SVMs) are explored to discriminate the emotions using the proposed prosodic features. Emotion recognition performance is analyzed separately for the duration, pitch and energy contours, and is observed to be 64%, 67% and 53%, respectively. Fusion techniques are explored at the feature and score levels. The performance of the fusion-based emotion recognition systems is observed to be 69% and 74% for feature-level and score-level fusion, respectively.
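The difference between the two fusion levels mentioned in the abstract can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the score combination rule, so a weighted sum is assumed here, and the function names (`fuse_features`, `fuse_scores`, `decide`), the weights and all score values are hypothetical.

```python
# Illustrative sketch only: feature-level vs. score-level fusion.
# All numbers, weights and function names are hypothetical, not from the paper.

EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness"]


def fuse_features(duration, pitch, energy):
    """Feature-level fusion: concatenate the three contours into one vector
    on which a single classifier (for example an SVM) would be trained."""
    return list(duration) + list(pitch) + list(energy)


def fuse_scores(score_sets, weights):
    """Score-level fusion: weighted sum of per-classifier class scores.

    score_sets: one list of per-emotion scores for each classifier.
    weights: one weight per classifier (an assumption; the abstract does
    not state how the scores are combined).
    """
    if len(score_sets) != len(weights):
        raise ValueError("need one weight per classifier")
    fused = [0.0] * len(EMOTIONS)
    for scores, w in zip(score_sets, weights):
        if len(scores) != len(EMOTIONS):
            raise ValueError("need one score per emotion")
        for i, s in enumerate(scores):
            fused[i] += w * s
    return fused


def decide(fused):
    """Return the emotion with the highest fused score."""
    best = max(range(len(fused)), key=lambda i: fused[i])
    return EMOTIONS[best]


# Made-up scores from three separate classifiers for one utterance.
duration_scores = [0.50, 0.10, 0.10, 0.20, 0.05, 0.05]
pitch_scores = [0.20, 0.10, 0.10, 0.50, 0.05, 0.05]
energy_scores = [0.40, 0.10, 0.10, 0.30, 0.05, 0.05]

fused = fuse_scores([duration_scores, pitch_scores, energy_scores],
                    [1 / 3, 1 / 3, 1 / 3])
print(decide(fused))
```

In this toy case the pitch-based classifier alone would choose happiness, while the equally weighted combination chooses anger. In feature-level fusion the contours are merged before classification, so only one classifier is trained; in score-level fusion each contour keeps its own classifier and only their outputs are combined.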

[1]  Ling Guan, et al. An investigation of speech-based human emotion recognition, 2004, IEEE 6th Workshop on Multimedia Signal Processing.

[2]  Frank Dellaert, et al. Recognizing emotion in speech, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[3]  João Paulo Papa, et al. Spoken emotion recognition through optimum-path forest classification using glottal features, 2010, Comput. Speech Lang.

[4]  Shashidhar G. Koolagudi, et al. IITKGP-SESC: Speech Database for Emotion Analysis, 2009, IC3.

[5]  Dimitrios Ververidis, et al. A State of the Art Review on Emotional Speech Databases, 2003.

[6]  Shrikanth S. Narayanan, et al. Toward detecting emotions in spoken dialogs, 2005, IEEE Transactions on Speech and Audio Processing.

[7]  Albino Nogueiras, et al. Speech emotion recognition using hidden Markov models, 2001, INTERSPEECH.