Emotional speaker verification with linear adaptation

Speaker verification performance degrades significantly on emotional speech. We present an adaptation approach based on maximum likelihood linear regression (MLLR) and its feature-space variant, constrained MLLR (CMLLR). Our preliminary experiments show that this approach improves performance considerably, particularly with CMLLR (about 10% relative reduction in equal error rate (EER) on average). We also find that the gain increases significantly when a large set of training data is used to estimate the transform.
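As a rough illustration of the two quantities mentioned above, the following minimal sketch applies a feature-space affine transform of the form x' = Ax + b to a few feature frames and computes a relative EER reduction. The function names, the toy matrix A, the bias b, and the EER values are all hypothetical and chosen only for illustration; the sketch does not estimate the transform and is not the system evaluated in the paper.

```python
def apply_feature_transform(frames, A, b):
    """Apply the affine map x' = A x + b to every feature frame.

    frames: list of feature vectors (lists of floats)
    A:      square matrix as a list of rows (hypothetical, not estimated here)
    b:      bias vector (hypothetical)
    """
    return [
        [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(A, b)]
        for x in frames
    ]


def relative_eer_reduction(eer_baseline, eer_adapted):
    """Relative EER reduction: (baseline - adapted) / baseline."""
    return (eer_baseline - eer_adapted) / eer_baseline


# Toy 2-dimensional example with made-up values.
frames = [[1.0, 2.0], [0.0, -1.0]]
A = [[1.0, 0.5], [0.0, 2.0]]
b = [0.1, -0.2]
adapted = apply_feature_transform(frames, A, b)

# Made-up EERs (in percent) used only to show the arithmetic.
reduction = relative_eer_reduction(10.0, 9.0)
```

In practice the transform would be estimated by maximizing likelihood on adaptation data; here A and b are fixed by hand so that only the application step and the metric are shown.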
