Paper information - The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech

Emotion recognition from speech is an important field of research in human-machine interfaces and has various applications, for instance in call centers. In the proposed classifier system, RASTA-PLP features (perceptual linear prediction) are extracted from the speech signals. The first step is to compute a universal background model (UBM) representing the general structure of the underlying feature space of speech signals. This UBM is modeled as a Gaussian mixture model (GMM). After the UBM has been computed, the sequence of feature vectors extracted from an utterance is used to re-train the UBM. From the resulting GMM, the mean vectors are extracted and concatenated into so-called GMM supervectors, which are then passed to a support vector machine classifier. The overall system has been evaluated on utterances from the public Berlin emotional database. Using the proposed features, an utterance-based recognition rate of 79% has been achieved, which is close to human performance on this database.
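The supervector step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says the UBM is "re-trained" on each utterance, so the sketch assumes relevance-MAP adaptation of the means, as is common in the speaker-verification literature. The diagonal covariances, the function names, and the `relevance` value of 16 are all assumptions made for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one feature frame."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the UBM means to one utterance (assumed adaptation scheme)."""
    k, d = len(means), len(means[0])
    counts = [0.0] * k                      # soft frame counts per component
    sums = [[0.0] * d for _ in range(k)]    # posterior-weighted frame sums
    for x in frames:
        for c, gamma in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += gamma
            for j in range(d):
                sums[c][j] += gamma * x[j]
    # Interpolate between the utterance statistics and the UBM prior mean.
    return [
        [(sums[c][j] + relevance * means[c][j]) / (counts[c] + relevance)
         for j in range(d)]
        for c in range(k)
    ]


def supervector(adapted_means):
    """Concatenate all component means into one fixed-length vector."""
    return [value for mean in adapted_means for value in mean]


# Toy UBM with two 1-D components (illustrative values only).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_variances = [[1.0], [1.0]]
utterance = [[2.0]] * 16  # stand-in for a sequence of RASTA-PLP frames

sv = supervector(adapt_means(utterance, ubm_weights, ubm_means, ubm_variances))
```

Every utterance yields a vector of the same length (number of components times feature dimension) regardless of its duration, which is what allows a standard support vector machine to be trained on one supervector per utterance.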

Günther Palm | Friedhelm Schwenker | Stefan Scherer | Yasmine M. Magdi

[1] Astrid Paeschke, et al. A database of German emotional speech, 2005, INTERSPEECH.

[2] Douglas A. Reynolds, et al. A Tutorial on Text-Independent Speaker Verification, 2004, EURASIP J. Adv. Signal Process.

[3] G. Palm, et al. Classifier fusion for emotion recognition from speech, 2007.

[4] Ramin Shaghaghi Kandovan, et al. Optimization of speaker verification using adapted Gaussian mixture models for high quality databases, 2007.

[5] H. Hermansky, et al. The modulation spectrum in the automatic recognition of speech, 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.

[6] J. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, 1998.

[7] Jeff A. Bilmes, et al. A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, 1998.

[8] Pierre-Yves Oudeyer, et al. The production and recognition of emotions in speech: features and algorithms, 2003, Int. J. Hum. Comput. Stud.

[9] J. G. Taylor, et al. Emotion recognition in human-computer interaction, 2005, Neural Networks.

[10] Steven J. Simske, et al. Recognition of emotions in interactive voice response systems, 2003, INTERSPEECH.

[11] Malcolm Slaney, et al. Construction and evaluation of a robust multifeature speech/music discriminator, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[12] Ryohei Nakatsu, et al. Emotion Recognition in Speech Using Neural Networks, 2000, Neural Computing & Applications.

[13] George N. Votsis, et al. Emotion recognition in human-computer interaction, 2001, IEEE Signal Process. Mag.

[14] Lori Lamel, et al. Challenges in real-life emotion annotation and machine learning based detection, 2005, Neural Networks.

[15] Misha Pavel, et al. On the relative importance of various components of the modulation spectrum for automatic speech recognition, 1999, Speech Commun.

[16] R. Plomp, et al. Effect of reducing slow temporal modulations on speech reception, 1994, The Journal of the Acoustical Society of America.

[17] Zhigang Deng, et al. Emotion recognition based on phoneme classes, 2004, INTERSPEECH.

[18] Hynek Hermansky, et al. Auditory Modeling in Automatic Recognition of Speech, 1996.

[19] Douglas E. Sturim, et al. Support vector machines using GMM supervectors for speaker verification, 2006, IEEE Signal Processing Letters.

[20] Frank Dellaert, et al. Recognizing emotion in speech, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[21] Pierre-Yves Oudeyer, et al. The production and recognition of emotions in speech: features and algorithms, 2003.

[22] Hynek Hermansky, et al. RASTA processing of speech, 1994, IEEE Trans. Speech Audio Process.

[23] A. Atiya, et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 2005, IEEE Transactions on Neural Networks.