Multistyle classification of speech under stress using feature subset selection based on genetic algorithms

Determining a speaker's emotional state from speech adds to the information available about that speaker. It is therefore important to be able to detect and identify a speaker's emotional state or state of stress. Various techniques are used in the literature to classify emotional/stressed states on the basis of speech, often combining several different speech feature vectors. This study proposes a new feature vector intended to allow better classification of emotional/stressed states. The components of the feature vector are obtained from a feature subset selection procedure based on genetic algorithms. The resulting feature vector gives good discrimination between neutral, angry, loud and Lombard states in the simulated domain of the Speech Under Simulated and Actual Stress (SUSAS) database, and between neutral and stressed states in the actual domain of the same database.
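To make the idea of genetic-algorithm feature subset selection concrete, the following is a minimal sketch and not the procedure used in the study. The abstract does not specify the encoding, fitness function, classifier or speech features, so everything here is an assumption made for illustration: each candidate subset is a binary mask over the features, fitness is the validation accuracy of a nearest-centroid classifier minus a small penalty per selected feature, and the data are synthetic two-class Gaussian samples rather than SUSAS features. All function and parameter names (`make_data`, `centroid_accuracy`, `ga_select`, `penalty`, `p_mut`) are hypothetical.

```python
import random

def make_data(rng, n_per_class=40, n_features=8, informative=(0, 3)):
    """Synthetic two-class data (assumption): only `informative` features differ."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in informative:
                row[j] += 3.0 * label  # shift class 1 on informative features
            data.append((row, label))
    return data

def centroid_accuracy(mask, train, valid):
    """Nearest-centroid accuracy using only the features where mask[j] == 1."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, y in valid:
        dists = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, centroids[c]))
                 for c in centroids}
        correct += (min(dists, key=dists.get) == y)
    return correct / len(valid)

def fitness(mask, train, valid, penalty=0.01):
    """Validation accuracy minus a small cost per selected feature (assumption)."""
    return centroid_accuracy(mask, train, valid) - penalty * sum(mask)

def ga_select(train, valid, n_features, rng,
              pop_size=20, generations=30, p_mut=0.05):
    """Evolve binary feature masks with tournament selection, one-point
    crossover, bit-flip mutation and elitism; return the best mask found."""
    def score(mask):
        return fitness(mask, train, valid)

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if score(a) >= score(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask over unchanged
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best

rng = random.Random(0)
train = make_data(rng)
valid = make_data(rng, n_per_class=20)
best_mask = ga_select(train, valid, n_features=8, rng=rng)
print(best_mask, centroid_accuracy(best_mask, train, valid))
```

The per-feature penalty is one simple way to favour compact subsets; a real system would more likely score candidate subsets by cross-validated accuracy of the actual stress classifier on held-out speakers.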
