Paper information - Study on Speaker-Independent Emotion Recognition from Speech on Real-World Data

In the present work we report results from ongoing research on speaker-independent emotion recognition. Experiments examine the behavior of a detector of negative emotional states on acted and non-acted speech. In addition, score-level fusion of two classifiers is applied at the utterance level in an attempt to improve the performance of the emotion recognizer. The experimental results show significant differences between recognizing emotions in acted speech and in real-world speech.
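
The abstract does not state which fusion rule, classifiers, or score types are used, so the following is only a minimal sketch of what utterance-level score fusion can look like. It assumes each classifier emits one real-valued score per utterance (for example a log-likelihood ratio, where higher means "more likely negative") and combines them with a weighted sum. The names `fuse_scores` and `detect_negative`, the `weight` parameter, and the `threshold` default are all hypothetical and not taken from the paper.

```python
def fuse_scores(score_a: float, score_b: float, weight: float = 0.5) -> float:
    """Combine two per-utterance classifier scores with a weighted sum.

    `weight` is the share given to the first classifier; the second
    classifier receives the remaining `1 - weight`. This weighted-sum
    rule is an assumption, not the paper's documented method.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * score_a + (1.0 - weight) * score_b


def detect_negative(score_a: float, score_b: float,
                    weight: float = 0.5, threshold: float = 0.0) -> bool:
    """Flag an utterance as negative when the fused score exceeds a threshold.

    The threshold of 0.0 is a placeholder; in practice it would be tuned
    on held-out data.
    """
    return fuse_scores(score_a, score_b, weight) > threshold
```

With this sketch, `detect_negative(1.0, 3.0)` fuses the two scores with equal weight and compares the result against the threshold. Both the weight and the threshold would normally be chosen on a development set drawn from speakers not seen in testing, so that the evaluation stays speaker-independent.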

Theodoros Kostoulas | Nikos Fakotakis | Todor Ganchev
