Speech Emotion Recognition and Intensity Estimation

This paper presents a system for speech emotion analysis. From a corpus of over 1700 utterances recorded from a single speaker, a feature vector stream is extracted for each utterance based on short-time log frequency power coefficients (LFPC). Using these feature vector streams, we train Hidden Markov Models (HMMs) to recognize seven basic emotion categories, including neutral, happiness, anger, sadness, surprise, and fear. The intensity of each recognized emotion is further divided into three levels, and 18 sub-HMMs are trained to identify it. Experimental results show that the system achieves good performance in both emotion recognition and intensity estimation.
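The two-stage decision described above (first pick the emotion, then pick its intensity level) can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: it assumes the feature vectors have already been quantized into discrete symbols, scores each model with a scaled forward algorithm, and uses made-up toy models. The names `log_forward`, `classify`, `toy_model`, `EMOTIONS`, and `INTENSITIES` are hypothetical.

```python
import math


def log_forward(obs, start, trans, emit):
    """Log-likelihood of a discrete symbol sequence under an HMM,
    computed with the scaled forward algorithm."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    # Initialization: start probability times emission of the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik


def classify(obs, emotion_models, intensity_models):
    """Stage 1: choose the emotion whose HMM scores highest.
    Stage 2: choose the intensity sub-HMM of that emotion, if any."""
    emotion = max(emotion_models,
                  key=lambda e: log_forward(obs, *emotion_models[e]))
    subs = intensity_models.get(emotion)
    if not subs:
        return emotion, None
    level = max(subs, key=lambda lv: log_forward(obs, *subs[lv]))
    return emotion, level


def toy_model(p1):
    """Hypothetical 2-state HMM over symbols {0, 1}; p1 is roughly the
    probability of emitting symbol 1. Parameters are illustrative only."""
    start = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[1.0 - p1, p1], [1.0 - p1 + 0.05, p1 - 0.05]]
    return start, trans, emit


# Toy stand-ins for trained models (assumed, not taken from the paper).
EMOTIONS = {"neutral": toy_model(0.1), "anger": toy_model(0.8)}
INTENSITIES = {
    "anger": {"low": toy_model(0.6),
              "medium": toy_model(0.75),
              "high": toy_model(0.9)},
}

if __name__ == "__main__":
    print(classify([1, 1, 1, 1], EMOTIONS, INTENSITIES))
    print(classify([0, 0, 0, 0], EMOTIONS, INTENSITIES))
```

In this sketch only the sub-models of the winning emotion are evaluated in the second stage, so an error in the first stage cannot be corrected by the second. Whether the original system handles this differently is not stated in the abstract.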
