An Enhanced Speech Emotion Recognition System Based on Discourse Information

There is a certain correlation between the emotional states of two people in a conversation, but no previous work has focused on it. In this paper, a novel Chinese conversation database was collected, and an emotion interaction matrix was proposed to capture the discourse information in conversation. Based on this discourse information, an enhanced speech emotion recognition system was presented to improve recognition accuracy. Some modifications were made to traditional KNN classification to reduce the interference of noise. Experimental results show that our system achieves a 3% to 5% relative improvement over the traditional method.
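The abstract does not specify how the emotion interaction matrix is built or how KNN is modified, so the following is only an illustrative sketch under stated assumptions. It assumes the matrix holds smoothed conditional frequencies of one speaker's emotion given the other speaker's preceding emotion, and that it rescales the class scores of a distance-weighted KNN vote. The label set, function names, smoothing constant, and the multiplicative combination are all hypothetical and not taken from the paper.

```python
import numpy as np

# Hypothetical label set; the paper's actual emotion categories are not given here.
EMOTIONS = ["neutral", "happy", "angry", "sad"]


def interaction_matrix(turn_pairs, n_classes, smoothing=1.0):
    """Estimate P(current emotion | previous speaker's emotion) from label pairs.

    turn_pairs: iterable of (previous_label, current_label) integer pairs.
    Additive smoothing (an assumption) keeps unseen transitions non-zero.
    """
    counts = np.full((n_classes, n_classes), smoothing, dtype=float)
    for prev, cur in turn_pairs:
        counts[prev, cur] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def knn_scores(x, train_X, train_y, n_classes, k=3):
    """Distance-weighted KNN vote, normalized so the class scores sum to 1.

    Inverse-distance weighting is one common way to make KNN less sensitive
    to noisy neighbors; it is not necessarily the paper's modification.
    """
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    scores = np.zeros(n_classes)
    for i in nearest:
        scores[train_y[i]] += 1.0 / (dists[i] + 1e-9)
    return scores / scores.sum()


def classify_with_discourse(x, prev_emotion, train_X, train_y, M, k=3):
    """Rescale acoustic KNN scores by the interaction-matrix row for prev_emotion."""
    acoustic = knn_scores(x, train_X, train_y, M.shape[0], k)
    combined = acoustic * M[prev_emotion]
    return int(np.argmax(combined))
```

In this sketch, an utterance whose acoustic features are ambiguous between two classes is resolved toward the class that more often follows the other speaker's previous emotion in the training conversations.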
