A study of the effect of emotional state upon text-independent speaker identification

In this paper we evaluate the effect of a speaker's emotional state on text-independent speaker identification. Mel-frequency cepstral coefficients (MFCCs) are used as spectral features, and Gaussian Mixture Models (GMMs) are employed to train the speaker models and to test the system. The tests are performed on the Berlin emotional speech database, which contains 10 speakers recorded in different emotional states: happiness, anger, fear, boredom, sadness, and neutral. The results show that the emotional state has a significant influence on text-independent speaker identification. Finally, we outline a possible solution to this problem.
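To make the described pipeline concrete, the following is a minimal sketch of MFCC-based, GMM-based text-independent speaker identification, assuming librosa for feature extraction and scikit-learn for the mixture models. The MFCC order (13), the number of mixture components (32), and the file names in the usage comment are illustrative assumptions, not values or data taken from the paper.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture


def extract_mfcc(path, sr=16000, n_mfcc=13):
    """Load an utterance and return its MFCC frames as an (n_frames, n_mfcc) array."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T


def train_speaker_models(train_files, n_components=32):
    """Fit one diagonal-covariance GMM per speaker on that speaker's training MFCCs.

    train_files: dict mapping speaker id -> list of training wav paths.
    """
    models = {}
    for speaker, paths in train_files.items():
        feats = np.vstack([extract_mfcc(p) for p in paths])
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type='diag', max_iter=200)
        gmm.fit(feats)
        models[speaker] = gmm
    return models


def identify(models, test_path):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    feats = extract_mfcc(test_path)
    return max(models, key=lambda spk: models[spk].score(feats))


# Hypothetical usage: train on neutral speech, test on an emotional utterance.
# train_files = {'spk03': ['spk03_neutral_01.wav', 'spk03_neutral_02.wav'], ...}
# models = train_speaker_models(train_files)
# print(identify(models, 'spk03_angry_07.wav'))
```

In this sketch, a mismatch between the (neutral) training speech and the (emotional) test speech is exactly the condition whose effect the paper measures.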
