Spoken affect classification using neural networks

This paper describes an affect recognition system that analyses acoustic speech signals. A database of 391 authentic emotional utterances was collected from 11 speakers. Two emotional states, anger and neutral, were considered. Features relating to pitch, energy and rhythm were extracted and used as input feature vectors for a neural network classifier. Forward selection was employed to prune redundant and harmful inputs. Initial results show a classification rate of 86.1%.
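The forward selection step can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so this is a generic greedy variant under stated assumptions: a nearest-centroid classifier stands in for the neural network to keep the example short, the toy data is invented for illustration, and the names `forward_select` and `centroid_accuracy` are hypothetical rather than taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]


def centroid_accuracy(features: List[int],
                      X_tr: Sequence[Row], y_tr: Sequence[int],
                      X_va: Sequence[Row], y_va: Sequence[int]) -> float:
    """Validation accuracy of a nearest-centroid classifier restricted to
    the given feature indices (a stand-in for the paper's neural network)."""
    centroids = {}
    for label in sorted(set(y_tr)):
        rows = [x for x, y in zip(X_tr, y_tr) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in zip(X_va, y_va):
        # Predict the class whose centroid is closest in squared distance.
        pred = min(centroids,
                   key=lambda c: sum((x[f] - m) ** 2
                                     for f, m in zip(features, centroids[c])))
        correct += pred == y
    return correct / len(X_va)


def forward_select(n_features: int,
                   score: Callable[[List[int]], float],
                   min_gain: float = 1e-9) -> Tuple[List[int], float]:
    """Greedy forward selection: start from an empty set and repeatedly add
    the single feature that most improves the score; stop when none helps."""
    selected: List[int] = []
    remaining = list(range(n_features))
    best = float("-inf")
    while remaining:
        trials = [(score(selected + [f]), f) for f in remaining]
        # Ties are broken in favour of the lowest feature index.
        top_score, top_f = max(trials, key=lambda t: (t[0], -t[1]))
        if top_score <= best + min_gain:
            break  # no candidate improves the score: stop adding inputs
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


# Invented toy data: feature 0 is informative, feature 1 is misleading on
# the validation set (harmful), feature 2 duplicates feature 0 (redundant).
X_train = [[0.0, 5.0, 0.0], [1.0, 5.0, 1.0], [9.0, -5.0, 9.0], [10.0, -5.0, 10.0]]
y_train = [0, 0, 1, 1]
X_val = [[0.5, -20.0, 0.5], [1.5, -20.0, 1.5], [8.5, 20.0, 8.5], [9.5, 20.0, 9.5]]
y_val = [0, 0, 1, 1]

selected, best = forward_select(
    3, lambda feats: centroid_accuracy(feats, X_train, y_train, X_val, y_val))
print(selected, best)
```

On this toy data only the informative feature is kept: the harmful input lowers validation accuracy when added and the redundant one brings no gain, so neither enters the selected set.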
