Decision tree SVM model with Fisher feature selection for speech emotion recognition

In multi-class speech emotion recognition, the overall recognition rate drops as confusion between emotions increases. To address this problem, we propose a speech emotion recognition method based on a decision tree support vector machine (SVM) model with Fisher feature selection. At the feature selection stage, the Fisher criterion is used to select the feature parameters with higher discriminative ability. At the emotion classification stage, an algorithm is proposed to determine the structure of the decision tree. The decision tree SVM performs a two-step classification: a rough classification followed by a fine classification. Redundant parameters are thus eliminated and emotion recognition performance is improved. In this method, the decision tree SVM framework is first established by calculating the degree of confusion between emotions, and then the features with higher discriminative ability are selected for each SVM of the decision tree according to the Fisher criterion. Finally, speech emotion recognition is performed with this model. Decision tree SVMs with Fisher feature selection are constructed on the CASIA Chinese emotion speech corpus and the Berlin speech corpus to validate the effectiveness of the framework. The experimental results show that the average emotion recognition rate of the proposed method is 9% higher than that of the traditional SVM classification method on CASIA, and 8.26% higher on the Berlin speech corpus. These results verify that the proposed method can effectively reduce emotional confusion and improve the emotion recognition rate.
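
The Fisher feature selection step described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common form of the Fisher score: for each feature, the between-class scatter of the class means divided by the within-class scatter. The function names `fisher_scores` and `select_top_k` and the toy data are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column.

    Assumed definition (not confirmed by the source):
    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c sum_i (x_ij - mean_cj)^2
    """
    n_features = len(samples[0])

    # Group the feature vectors by class label.
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for rows in by_class.values():
            values = [row[j] for row in rows]
            mean_c = sum(values) / len(values)
            between += len(values) * (mean_c - overall_mean) ** 2
            within += sum((v - mean_c) ** 2 for v in values)
        # Small constant guards against division by zero for constant features.
        scores.append(between / (within + 1e-12))
    return scores


def select_top_k(samples, labels, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: feature 0 separates the two classes, feature 1 does not.
samples = [[0.0, 1.0], [0.1, 5.0], [1.0, 1.2], [1.1, 4.8]]
labels = ["calm", "calm", "angry", "angry"]
print(select_top_k(samples, labels, 1))
```

In the scheme the abstract describes, a ranking of this kind would be computed separately for each SVM node of the decision tree, using only the emotion classes that node has to separate.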
