Efficient score normalization for speaker recognition

Score normalization is an important component in most speech classification tasks, including speaker recognition. State-of-the-art scoring approaches use both T-norm and Z-norm. This paper pursues three goals: a better understanding of existing score normalization methods, a reduced need for explicit score normalization, and improved computational efficiency of score normalization. In addition, the importance of score normalization for speaker identification is demonstrated, and accuracy is improved considerably using various normalization techniques.
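For readers unfamiliar with the two methods named above, the following is a minimal sketch of the textbook definitions of Z-norm and T-norm, not of the efficient variants this paper proposes. Both standardize a raw score with a mean and standard deviation. Z-norm takes these statistics from the target model scored against a set of impostor utterances, while T-norm takes them from the test utterance scored against a set of cohort models. The function names, argument names, and the choice of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def _standardize(raw_score, reference_scores):
    """Shift and scale raw_score by the mean and std of reference_scores."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)  # assumption: population std
    if sigma == 0.0:
        raise ValueError("reference scores have zero variance")
    return (raw_score - mu) / sigma


def z_norm(raw_score, impostor_utterance_scores):
    """Z-norm: statistics come from the target model scored against
    impostor utterances (can be computed offline, per model)."""
    return _standardize(raw_score, impostor_utterance_scores)


def t_norm(raw_score, cohort_model_scores):
    """T-norm: statistics come from the test utterance scored against
    cohort (impostor) models (computed at test time, per utterance)."""
    return _standardize(raw_score, cohort_model_scores)


if __name__ == "__main__":
    # Hypothetical scores, chosen only to make the arithmetic easy to follow.
    print(z_norm(2.0, [-1.0, 1.0]))
    print(t_norm(3.0, [1.0, 3.0]))
```

In this sketch the T-norm statistics have to be recomputed for every test utterance, which is one reason the computational cost of score normalization matters.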
