On-line speaker adaptation based emotion recognition using incremental emotional information

This paper proposes a new Speech Emotion Recognition (SER) framework. Compared to speaker-independent emotion models, speaker-adapted models built from a speaker's own emotional speech data can represent that speaker's emotional characteristics more precisely, thus improving SER accuracy. However, it is hard to collect a sufficient amount of personal emotional data at once. For this reason, we propose an MLLR-based online speaker adaptation technique that uses incrementally accumulated personal data. Compared to speech models, reliable emotion models suitable for MLLR are relatively difficult to construct because of their domain-oriented characteristics. Thus, we modify the conventional MLLR procedure with selective label refinement, which categorizes newly accumulated adaptation data into discriminative and non-discriminative data and refines the labels of only the discriminative data. In SER experiments on an LDC emotion corpus, our approach outperformed conventional adaptation techniques as well as the speaker-independent model framework.
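
The selective label refinement step can be illustrated with a minimal sketch. The abstract does not state how data are judged discriminative, so the criterion here is an assumption: an utterance counts as discriminative when the gap between its two highest per-emotion log-likelihoods reaches a threshold. The function name `selective_label_refinement`, the `margin_threshold` parameter, and the score format are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def selective_label_refinement(
    scores: List[Dict[str, float]],
    initial_labels: List[str],
    margin_threshold: float,
) -> Tuple[List[str], List[bool]]:
    """Refine labels only for utterances judged discriminative.

    ASSUMPTION (not from the paper): an utterance is "discriminative"
    when the gap between its best and second-best emotion
    log-likelihood is at least `margin_threshold`.
    Each entry of `scores` must contain at least two emotions.
    """
    refined_labels: List[str] = []
    discriminative: List[bool] = []
    for utt_scores, label in zip(scores, initial_labels):
        # Rank emotion classes by log-likelihood, highest first.
        ranked = sorted(utt_scores.items(), key=lambda kv: kv[1], reverse=True)
        (best_label, best), (_, second) = ranked[0], ranked[1]
        is_disc = (best - second) >= margin_threshold
        # Discriminative data: replace the label with the top-scoring emotion.
        # Non-discriminative data: keep the initial label unchanged.
        refined_labels.append(best_label if is_disc else label)
        discriminative.append(is_disc)
    return refined_labels, discriminative


# Toy log-likelihoods for two newly accumulated utterances.
scores = [
    {"angry": -10.0, "neutral": -25.0},  # wide gap between classes
    {"angry": -18.0, "neutral": -17.5},  # narrow gap between classes
]
initial = ["neutral", "angry"]
labels, flags = selective_label_refinement(scores, initial, margin_threshold=5.0)
print(labels, flags)
```

In this sketch the refined labels would then supervise the next MLLR update, in which Gaussian means are adapted by an affine transform of the form mu' = A mu + b. How the paper actually combines the two data categories during adaptation is not specified in the abstract.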
