Text Independent Emotion Recognition Using Spectral Features

This paper presents text-independent emotion recognition from speech using mel frequency cepstral coefficients (MFCCs) together with their velocity and acceleration coefficients. The simulated Hindi emotional speech corpus IITKGP-SEHSC is used for the emotion recognition studies. The emotions considered are anger, disgust, fear, happy, neutral, sad, sarcastic, and surprise. Gaussian mixture models (GMMs) are used to build the emotion recognition models. Emotion recognition performance is compared for the text-independent and text-dependent cases, with recognition rates of around 72% and 82% observed, respectively.
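The feature and classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the delta window, the number of mixture components, or the covariance type, so the regression window `N=2`, the diagonal covariances, and all function names (`delta`, `stack_features`, `gmm_frame_loglik`, `classify`) are assumptions made for illustration. MFCC extraction and GMM training are not shown; frames are plain lists of floats and each GMM is a hand-written list of `(weight, means, variances)` tuples.

```python
import math

def delta(frames, N=2):
    """Regression-based delta coefficients over a sequence of feature vectors.

    Edge frames are handled by repeating the first/last frame (an assumption).
    """
    T = len(frames)
    denom = 2 * sum(n * n for n in range(1, N + 1))
    out = []
    for t in range(T):
        row = []
        for d in range(len(frames[0])):
            num = 0.0
            for n in range(1, N + 1):
                prev = frames[max(t - n, 0)][d]
                nxt = frames[min(t + n, T - 1)][d]
                num += n * (nxt - prev)
            row.append(num / denom)
        out.append(row)
    return out

def stack_features(mfcc):
    """Append velocity (delta) and acceleration (delta-delta) to each frame."""
    vel = delta(mfcc)
    acc = delta(vel)
    return [c + v + a for c, v, a in zip(mfcc, vel, acc)]

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a diagonal-covariance GMM."""
    comps = []
    for w, mu, var in gmm:
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        comps.append(ll)
    top = max(comps)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))

def classify(frames, models):
    """Pick the emotion whose GMM gives the highest total log-likelihood."""
    scores = {
        label: sum(gmm_frame_loglik(x, gmm) for x in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get)

# Toy usage with made-up one-dimensional models (hypothetical parameters).
toy_models = {
    "anger": [(1.0, [5.0], [1.0])],
    "neutral": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
}
toy_label = classify([[0.2], [0.7]], toy_models)
```

In a real system one GMM per emotion would be trained on the stacked features of that emotion's utterances, and a test utterance would be scored against all eight models in the same way as `classify` does here.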
