Study on Speaker-Independent Emotion Recognition from Speech on Real-World Data

In the present work we report results from on-going research activity in the area of speaker-independent emotion recognition. Experimentations are performed towards examining the behavior of a detector of negative emotional states over non-acted/acted speech. Furthermore, a score-level fusion of two classifiers on utterance level is applied, in attempt to improve the performance of the emotion recognizer. Experimental results demonstrate significant differences on recognizing emotions on acted/real-world speech.

[1]  Emiel Krahmer,et al.  Real vs. acted emotional speech , 2006, INTERSPEECH.

[2]  Björn W. Schuller,et al.  Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles , 2005, INTERSPEECH.

[3]  Laurence Devillers,et al.  Real-life emotions detection with lexical and paralinguistic cues on human-human call center dialogs , 2006, INTERSPEECH.

[4]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .

[5]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[6]  Theodoros Kostoulas,et al.  A Real-World Emotional Speech Corpus for Modern Greek , 2008, LREC.

[7]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[8]  Bin Yang,et al.  The Relevance of Voice Quality Features in Speaker Independent Emotion Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[9]  Shrikanth S. Narayanan,et al.  Toward detecting emotions in spoken dialogs , 2005, IEEE Transactions on Speech and Audio Processing.

[10]  Astrid Paeschke,et al.  A database of German emotional speech , 2005, INTERSPEECH.

[11]  P. Boersma ACCURATE SHORT-TERM ANALYSIS OF THE FUNDAMENTAL FREQUENCY AND THE HARMONICS-TO-NOISE RATIO OF A SAMPLED SOUND , 1993 .