A novel hierarchical speech emotion recognition method based on improved DDAGSVM

To improve the accuracy of speech emotion recognition, this paper proposes a novel hierarchical method based on an improved Decision Directed Acyclic Graph SVM (improved DDAGSVM). The improved DDAGSVM is constructed according to the confusion degrees of emotion pairs. In addition, a geodesic distance-based testing algorithm is proposed for the improved DDAGSVM so that test samples that are difficult to distinguish receive multiple decision chances. The informative features and the optimized SVM parameters used at each node of the improved DDAGSVM are obtained simultaneously by a Genetic Algorithm (GA). In recognition experiments on the Chinese Speech Emotion Database (CSED) and the Audio-Video Emotion Database (AVED) recorded by our workgroup, the improved DDAGSVM achieves higher recognition accuracy for 7 emotions than multi-SVM, a binary decision tree, and the traditional DDAGSVM, while using few selected informative features and moderate time.
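The idea of ordering the pairwise decisions of a DDAG by the confusion degrees of class pairs can be illustrated with a minimal sketch. It is not the construction used in the paper: the definition of confusion degree (the mean of the two mutual misclassification rates), the rule of evaluating the least confused remaining pair first, and the names `confusion_degrees`, `ddag_predict`, and `pairwise` are all assumptions made for illustration. Simple nearest-center rules stand in for the trained node SVMs, and the geodesic distance-based testing step and the GA search are omitted.

```python
from itertools import combinations


def confusion_degrees(conf):
    """Return a symmetric confusion degree for every class pair.

    conf[i][j] counts samples of true class i predicted as class j.
    Assumed definition: mean of the two mutual misclassification rates.
    """
    totals = [sum(row) for row in conf]
    return {
        (i, j): 0.5 * (conf[i][j] / totals[i] + conf[j][i] / totals[j])
        for i, j in combinations(range(len(conf)), 2)
    }


def ddag_predict(x, classes, deg, pairwise):
    """Eliminate one class per binary decision until a single class is left.

    Assumed ordering: the least confused remaining pair is decided first,
    so the hardest pair is decided last.
    """
    remaining = sorted(classes)
    while len(remaining) > 1:
        i, j = min(combinations(remaining, 2), key=lambda p: deg[p])
        winner = pairwise[(i, j)](x)  # binary node classifier for pair (i, j)
        remaining.remove(j if winner == i else i)
    return remaining[0]


# Toy data: 3 classes with 1-D centers at 0, 1 and 2 (hypothetical values).
conf = [[8, 2, 0],
        [3, 6, 1],
        [0, 1, 9]]
deg = confusion_degrees(conf)

# Stand-ins for trained pairwise SVMs: pick the nearer class center.
pairwise = {
    (i, j): (lambda x, i=i, j=j: i if abs(x - i) <= abs(x - j) else j)
    for i, j in combinations(range(3), 2)
}

label = ddag_predict(0.9, range(3), deg, pairwise)
```

With K classes, each test sample passes through K - 1 binary decisions, as in a standard DDAG; only the order in which the pairs are evaluated depends on the confusion degrees.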

[1]  Yongzhao Zhan,et al.  Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network , 2008 .

[2]  Albino Nogueiras,et al.  Speech emotion recognition using hidden Markov models , 2001, INTERSPEECH.

[3]  Fakhri Karray,et al.  Speech Emotion Recognition using Gaussian Mixture Vector Autoregressive Models , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[4]  Cheng-Lung Huang,et al.  A GA-based feature selection and parameters optimizationfor support vector machines , 2006, Expert Syst. Appl..

[5]  Wei Wu,et al.  GMM Supervector Based SVM with Spectral Features for Speech Emotion Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[6]  Halis Altun,et al.  Boosting selection of speech related features to improve performance of multi-class SVMs in emotion detection , 2009, Expert Syst. Appl..

[7]  Chih-Jen Lin,et al.  A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.

[8]  Nello Cristianini,et al.  Large Margin DAGs for Multiclass Classification , 1999, NIPS.

[9]  Ling Guan,et al.  Recognizing Human Emotional State From Audiovisual Signals , 2008, IEEE Transactions on Multimedia.

[10]  Zhihong Zeng,et al.  Audio–Visual Affective Expression Recognition Through Multistream Fused HMM , 2008, IEEE Transactions on Multimedia.

[11]  Say Wei Foo,et al.  Speech emotion recognition using hidden Markov models , 2003, Speech Commun..

[12]  Constantine Kotropoulos,et al.  Emotional speech recognition: Resources, features, and methods , 2006, Speech Commun..

[14]  Constantine Kotropoulos,et al.  Fast and accurate sequential floating forward feature selection with the Bayes classifier applied to speech emotion recognition , 2008, Signal Process..

[15]  Ling Guan,et al.  A neural network approach for human emotion recognition in speech , 2004, 2004 IEEE International Symposium on Circuits and Systems (IEEE Cat. No.04CH37512).

[16]  Cao Peng,et al.  Research and implementation of emotional feature extraction and recognition in speech signal , 2005 .

[17]  Ling Guan,et al.  Recognizing Human Emotional State From Audiovisual Signals* , 2008, IEEE Transactions on Multimedia.

[18]  Björn W. Schuller,et al.  Hidden Markov model-based speech emotion recognition , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).