Paper information - State-of-the-Art Performance in Text-Independent Speaker Verification Through Open-Source Software

This paper illustrates an evolution in state-of-the-art speaker verification by highlighting the contribution of newly developed techniques. Starting from a baseline system based on Gaussian mixture models that reached state-of-the-art performance during the NIST'04 SRE, the final systems with new intersession compensation techniques show a relative gain of around 50%. This work highlights that classical maximum a posteriori (MAP) adaptation is still a key element in recent improvements, while the latest compensation methods have a crucial impact on overall performance. Nuisance attribute projection (NAP) and factor analysis (FA) are examined and shown to provide significant improvements. For FA, a new symmetrical scoring (SFA) approach is proposed. We also show further improvement with an original combination of a support vector machine and SFA. This work is undertaken through the open-source ALIZE toolkit.
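The abstract names nuisance attribute projection without giving its form. As a rough illustration only, NAP is commonly described as removing from a feature vector (for example a GMM supervector) its components along a set of directions assumed to capture session variability, that is, applying P = I - U U^T. The sketch below assumes the nuisance directions are already estimated and orthonormal; the function name `nap_project` and the toy vectors are hypothetical and are not taken from the paper or from the ALIZE API.

```python
def nap_project(x, nuisance_dirs):
    """Remove from x its components along the given nuisance directions.

    Assumption: nuisance_dirs is a list of orthonormal vectors with the
    same length as x, so the result equals (I - U U^T) x.
    """
    y = list(x)
    for u in nuisance_dirs:
        # Coefficient of x along this nuisance direction (dot product).
        coeff = sum(ui * xi for ui, xi in zip(u, x))
        # Subtract that component from the running result.
        y = [yi - coeff * ui for yi, ui in zip(y, u)]
    return y


if __name__ == "__main__":
    # Toy 3-dimensional "supervector" and one nuisance direction.
    supervector = [3.0, 4.0, 5.0]
    directions = [[1.0, 0.0, 0.0]]
    print(nap_project(supervector, directions))  # → [0.0, 4.0, 5.0]
```

In a real system the directions would be estimated from multi-session training data, and the projected vectors would then be passed to the classifier; both steps are outside the scope of this sketch.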
