The Relevance of Voice Quality Features in Speaker Independent Emotion Recognition

This paper investigates the classification of emotional states using prosodic and voice quality information. We aim to exploit the different phonation types that speakers use when producing emotions. As features, we therefore use prosodic features, voice quality parameters, and combinations of both. We study how prosodic and voice quality features overlap or complement each other in emotion recognition. The classification is speaker independent and uses a reduced subset of 8 features and a Bayesian classifier.
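To make the evaluation setup concrete, the following is a minimal sketch of speaker-independent classification with a Bayesian classifier. Several points are assumptions made for illustration and are not taken from the paper: the leave-one-speaker-out protocol, the Gaussian class model with diagonal covariance, the function names, and the toy two-feature data (the paper uses 8 features, which are not listed here).

```python
import math
from collections import defaultdict


def fit_gaussian_bayes(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class.

    Assumption: features are modelled as independent Gaussians per class.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n, dims = len(rows), len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / n + 1e-6
            for d in range(dims)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


def leave_one_speaker_out(X, y, speakers):
    """Train on all speakers but one, test on the held-out speaker, and repeat."""
    correct = 0
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        model = fit_gaussian_bayes([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)


# Hypothetical toy data: two made-up features, two emotions, three speakers.
X = [
    [1.00, 0.10], [1.20, 0.20], [5.00, 0.90], [5.10, 1.10],
    [0.90, 0.15], [1.10, 0.05], [4.90, 1.00], [5.20, 0.95],
    [1.05, 0.12], [0.95, 0.18], [5.05, 1.05], [4.95, 0.90],
]
y = ["neutral", "neutral", "anger", "anger"] * 3
speakers = ["s1"] * 4 + ["s2"] * 4 + ["s3"] * 4

accuracy = leave_one_speaker_out(X, y, speakers)
print(f"speaker-independent accuracy: {accuracy:.2f}")
```

The key design point is that no utterance from the test speaker is ever seen during training, so the reported accuracy reflects generalization to unseen voices rather than memorized speaker characteristics.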
