Sound based human emotion recognition using MFCC & multiple SVM

Emotion recognition from human speech is one of the latest challenges in speech processing and Human-Machine Interaction (HMI), driven by the varied operational needs of real-world applications. Besides facial expressions, speech has proven to be one of the most valuable modalities for automatic recognition of human emotions. Speech is a spontaneous medium for perceiving emotions, and it provides in-depth information. In this paper, we use Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and multiple Support Vector Machines (SVM) as the classifier. We performed extensive experiments on a sound database covering happy, anger, sad, disgust, surprise and neutral emotions. Performance analysis of the multiple SVM revealed that the non-linear kernel SVM achieved greater accuracy than the linear SVM.
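
The pipeline named in the abstract (MFCC features fed to linear and non-linear kernel SVMs) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame sizes, the 26-filter mel bank, the 13 coefficients, the per-utterance mean/std pooling, and the synthetic tones standing in for emotional speech are all hypothetical choices, and scikit-learn's `SVC` (which handles multi-class problems with a one-vs-one scheme) is only one possible reading of "multiple SVM".

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix. All defaults are assumptions."""
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II over the filter axis decorrelates the log energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T


def utterance_features(signal):
    """Pool frame-level MFCCs into one vector (mean and std per coefficient)."""
    coeffs = mfcc(signal)
    return np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])


def synthetic_utterance(rng, pitch_hz, sample_rate=16000, duration=0.5):
    """Hypothetical stand-in for a recording: a noisy two-harmonic tone."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    tone = np.sin(2 * np.pi * pitch_hz * t) + 0.5 * np.sin(4 * np.pi * pitch_hz * t)
    return tone + 0.1 * rng.standard_normal(t.size)


rng = np.random.default_rng(0)
base_pitch = {"neutral": 120.0, "sad": 100.0, "anger": 220.0}  # made-up values
X, y = [], []
for label, pitch in base_pitch.items():
    for _ in range(30):
        X.append(utterance_features(synthetic_utterance(rng, pitch * rng.uniform(0.9, 1.1))))
        y.append(label)
X, y = np.array(X), np.array(y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Compare a linear kernel against a non-linear (RBF) kernel on the same split.
scores = {}
for kernel in ("linear", "rbf"):
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)
    print(f"{kernel} kernel accuracy: {scores[kernel]:.3f}")
```

Because the data here are synthetic, the printed accuracies say nothing about the paper's reported result; the sketch only shows how the two kernels would be compared on identical MFCC features.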
