On NoMatchs, NoInputs and BargeIns: Do Non-Acoustic Features Support Anger Detection?

Most studies on speech-based emotion recognition rely on prosodic and acoustic features alone and employ artificially acted corpora, whose results do not generalize to telephone-based speech applications. In contrast, we present an approach based on utterances from 1,911 calls to a deployed telephone-based speech application, and we incorporate additional dialogue, NLU, and ASR features into the emotion recognition process. Depending on the task, these non-acoustic features add 2.3% in classification accuracy compared to using acoustic features alone.
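To illustrate the kind of feature fusion described above, the following is a minimal sketch (not the authors' implementation): it combines placeholder acoustic features with hypothetical non-acoustic features such as NoMatch/NoInput counts, a barge-in flag, and ASR confidence, and compares cross-validated accuracy with and without them. All feature names, the random dummy data, and the choice of classifier are illustrative assumptions.

```python
# Sketch: comparing anger classification with acoustic features only vs.
# acoustic + non-acoustic (dialogue/ASR) features. Data and feature names
# are hypothetical placeholders, not the corpus described in the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_utterances = 200

# Acoustic/prosodic features per utterance (e.g., pitch and energy statistics).
acoustic = rng.normal(size=(n_utterances, 4))

# Non-acoustic features: NoMatch/NoInput counts so far in the dialogue,
# whether the caller barged in, and the ASR confidence of the utterance.
nomatch_count = rng.integers(0, 4, size=(n_utterances, 1))
noinput_count = rng.integers(0, 3, size=(n_utterances, 1))
barge_in = rng.integers(0, 2, size=(n_utterances, 1))
asr_confidence = rng.uniform(0.0, 1.0, size=(n_utterances, 1))

# Dummy binary labels: 0 = non-angry, 1 = angry.
labels = rng.integers(0, 2, size=n_utterances)

acoustic_only = acoustic
combined = np.hstack([acoustic, nomatch_count, noinput_count,
                      barge_in, asr_confidence])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
for name, X in [("acoustic only", acoustic_only),
                ("acoustic + non-acoustic", combined)]:
    acc = cross_val_score(clf, X, labels, cv=5, scoring="accuracy").mean()
    print(f"{name}: mean accuracy = {acc:.3f}")
```

On real call data, the comparison between the two feature sets would quantify the contribution of the non-acoustic features; with the random placeholder data above, no meaningful difference is expected.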