Boosting selection of speech related features to improve performance of multi-class SVMs in emotion detection

This paper deals with strategies for feature selection and multi-class classification in the emotion detection problem. The aim is two-fold: to increase the effectiveness of four feature selection algorithms and to improve the accuracy of multi-class classifiers for the emotion detection problem under different frameworks and strategies. Although a large amount of research has been conducted to determine the most informative features in emotion detection, reliably identifying discriminating features remains an open problem. As highly informative features are believed to be a more critical factor than the classifier itself, recent studies have focused on identifying the features that contribute most to the classification problem. In this paper, in order to improve the performance of multi-class SVMs in emotion detection, 58 features extracted from recorded speech samples are processed in two new frameworks that boost the feature selection algorithms. Evaluation of the final feature sets confirms that the frameworks select a more informative subset of the features in terms of class separability. It is also found that, among the four feature selection algorithms, a recently proposed one, LSBOUND, significantly outperforms the others. The accuracy obtained with the proposed framework is the highest reported so far in the literature for the same dataset.
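To make the pipeline described above concrete, the sketch below shows a feature-selection step followed by a multi-class SVM. It is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a generic ANOVA F-score filter as a stand-in for the paper's feature selection algorithms (e.g. LSBOUND), and substitutes randomly generated placeholder data for the 58 speech features and emotion labels.

```python
# Minimal sketch (not the authors' implementation): filter-based feature
# selection followed by a multi-class SVM, assuming scikit-learn.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 58))    # placeholder for the 58 extracted speech features
y = rng.integers(0, 5, size=200)  # placeholder emotion labels (5 classes)

# Rank features with a simple class-separability criterion (ANOVA F-score,
# standing in for algorithms such as LSBOUND), then classify with a
# multi-class SVM (SVC uses a one-vs-one scheme by default).
best_score, best_k = 0.0, None
for k in (10, 20, 30, 40, 58):
    clf = make_pipeline(StandardScaler(),
                        SelectKBest(f_classif, k=k),
                        SVC(kernel="rbf", C=1.0, gamma="scale"))
    score = cross_val_score(clf, X, y, cv=5).mean()
    if score > best_score:
        best_score, best_k = score, k

print(f"best subset size: {best_k}, cross-validated accuracy: {best_score:.3f}")
```

On real speech features, the subset size and the selection criterion would be chosen by the frameworks evaluated in the paper; the grid over k here only illustrates how a selected feature subset feeds into the multi-class SVM.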
