Machine Learning Approach for Emotional Speech Classification

Recognizing emotion from speech remains a challenging research problem. This work proposes Singular Value Decomposition (SVD) as a dimensionality-reduction method for feature extraction. The novelty of the work is the combination of SVD features with a Support Vector Machine (SVM) classifier, which yields excellent results. The proposed features are evaluated on the emotion classification task by simulation, with the SVM designed to classify unseen emotions in speech. The classifier with these features substantially outperforms the compared methods, reaching an accuracy of approximately 90%, which is a step towards automatic emotion recognition.
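The pipeline described above (SVD-based dimensionality reduction followed by an SVM) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not say which speech representation the SVD is applied to, which SVM kernel is used, or which corpus is evaluated. The sketch therefore uses synthetic two-class vectors in place of speech features, centres the data before the SVD, and trains a binary linear soft-margin SVM by subgradient descent. The helper names (`make_toy_data`, `fit_svd_basis`, `project`, `train_linear_svm`, `predict`) are hypothetical.

```python
import numpy as np


def make_toy_data(n_per_class=100, n_features=40, seed=0):
    """Synthetic two-class data standing in for per-utterance feature vectors
    (an assumption; no real speech corpus is used here)."""
    rng = np.random.default_rng(seed)
    shift = np.zeros(n_features)
    shift[:5] = 2.0  # the classes differ only in the first five dimensions
    X = np.vstack([
        rng.normal(size=(n_per_class, n_features)) + shift,
        rng.normal(size=(n_per_class, n_features)) - shift,
    ])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    order = rng.permutation(len(y))
    return X[order], y[order]


def fit_svd_basis(X_train, k):
    """Return the training mean and the top-k right singular vectors.
    Centring first is an illustrative choice, not stated in the abstract."""
    mean = X_train.mean(axis=0)
    # Thin SVD of the centred matrix: Xc = U @ diag(s) @ Vt
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:k].T  # basis has shape (n_features, k)


def project(X, mean, basis):
    """Map feature vectors onto the reduced k-dimensional SVD subspace."""
    return (X - mean) @ basis


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Binary linear soft-margin SVM fitted by batch subgradient descent on
    the regularised hinge loss. Labels y must be in {-1, +1}."""
    n = len(y)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1           # samples inside the margin
        grad_w = lam * w - X[viol].T @ y[viol] / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Classify by the sign of the decision function."""
    return np.where(X @ w + b >= 0, 1.0, -1.0)


# Usage: fit the SVD basis on the training split only, then classify
# held-out samples in the reduced space.
X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
mean, basis = fit_svd_basis(X_train, k=5)
w, b = train_linear_svm(project(X_train, mean, basis), y_train)
accuracy = np.mean(predict(project(X_test, mean, basis), w, b) == y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The basis is fitted on the training split alone so that the held-out samples play the role of unseen data. Classifying more than two emotion classes would need an extension such as one-vs-rest, which this sketch does not cover, and the accuracy printed for the toy data says nothing about the roughly 90% reported above.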
