Emotion recognition using multi-parameter speech feature classification

Speech emotion recognition is the extraction and identification of emotional state from a speech signal. Speech data corresponding to the emotions happiness, sadness, and anger was recorded from 30 subjects, and a local database called Amritaemo was created containing 300 speech waveforms per emotion. Based on prosodic features (energy contour and pitch contour) and spectral features (cepstral coefficients, quefrency coefficients, and formant frequencies), the speech data was classified into the respective emotions. Supervised learning was used for training and testing, with two algorithms: hybrid rule-based K-means clustering and the multiclass Support Vector Machine (SVM). The results of the study showed that, for the optimized feature set, hybrid rule-based K-means clustering outperformed the multiclass SVM.
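The pipeline described above (prosodic feature extraction followed by multiclass classification) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it uses synthetic sinusoidal signals as stand-ins for the recorded speech, toy energy/pitch features, and scikit-learn's one-vs-rest multiclass SVM. All signal parameters, feature choices, and emotion-to-pitch mappings are assumptions made for the example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
EMOTIONS = ["happiness", "sadness", "anger"]

def prosodic_features(signal, sr=8000, frame=256):
    """Toy prosodic features: mean/std of frame energy plus a crude
    autocorrelation pitch estimate (illustrative, not the paper's method)."""
    frames = signal[: len(signal) // frame * frame].reshape(-1, frame)
    energy = (frames ** 2).sum(axis=1)
    # Crude pitch: lag of the autocorrelation peak in a plausible F0 range.
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lo, hi = sr // 400, sr // 80            # search 80-400 Hz
    pitch = sr / (lo + np.argmax(ac[lo:hi]))
    return np.array([energy.mean(), energy.std(), pitch])

def fake_sample(label, sr=8000):
    """Synthetic 'recording': a sinusoid whose pitch and amplitude depend
    on the emotion label (a hypothetical stand-in for real speech data)."""
    f0 = {"happiness": 220, "sadness": 110, "anger": 300}[label]
    amp = {"happiness": 0.8, "sadness": 0.3, "anger": 1.0}[label]
    t = np.arange(sr) / sr
    return amp * np.sin(2 * np.pi * f0 * t) + 0.01 * rng.standard_normal(sr)

# Build a small labelled training set, mirroring the supervised setup.
X, y = [], []
for label in EMOTIONS:
    for _ in range(20):
        X.append(prosodic_features(fake_sample(label)))
        y.append(label)

# One-vs-rest multiclass SVM, as one of the two classifiers compared.
clf = SVC(kernel="rbf", decision_function_shape="ovr")
clf.fit(X, y)
print(clf.predict([prosodic_features(fake_sample("anger"))])[0])
```

The hybrid rule-based K-means side of the comparison could be sketched analogously with `sklearn.cluster.KMeans`, with rules mapping each learned cluster to an emotion label.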
