Paper information - Within-class covariance normalization for SVM-based speaker recognition

This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.
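The procedure summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the regularizer `eps`, the complement weight `alpha`, the mean-centering step, and the use of a Cholesky factor to obtain a matrix `A` with `A A^T = W^{-1}` are all assumptions made for illustration, and the PCA-complement is taken here to be the residual of each vector after projection onto the PCA basis.

```python
import numpy as np


def within_class_cov(Z, labels):
    """Average of the per-class covariance matrices of the rows of Z."""
    classes = np.unique(labels)
    W = np.zeros((Z.shape[1], Z.shape[1]))
    for c in classes:
        D = Z[labels == c]
        D = D - D.mean(axis=0)          # center each class on its own mean
        W += D.T @ D / len(D)
    return W / len(classes)


def fit_pca_wccn(X, labels, n_components, eps=1e-6):
    """Estimate the PCA basis U and the WCCN matrix A (hypothetical helper).

    X: (n_samples, n_features) training vectors; labels: speaker ids.
    eps is an assumed ridge term that keeps W invertible.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA basis: leading right singular vectors of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:n_components].T             # (n_features, k), orthonormal columns
    # Within-class covariance measured inside the PCA space.
    W = within_class_cov(Xc @ U, labels) + eps * np.eye(n_components)
    # Lower-triangular A with A @ A.T == inv(W).
    A = np.linalg.cholesky(np.linalg.inv(W))
    return mean, U, A


def transform(X, mean, U, A, alpha):
    """Concatenate WCCN-normalized PCA coordinates with the weighted complement."""
    Xc = X - mean
    Z = Xc @ U                          # coordinates in the PCA space
    Z_wccn = Z @ A                      # row-wise A.T @ z
    complement = Xc - Z @ U.T           # residual outside the PCA space
    return np.hstack([Z_wccn, alpha * complement])
```

In this sketch, `alpha = 0` corresponds to discarding the PCA-complement, while larger values give the complement more influence in a linear kernel computed on the concatenated vectors. How the weight should be chosen is not stated in the abstract, so it would have to be tuned on held-out data.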

[1] Andreas Stolcke, et al. MLLR transforms as features in speaker recognition, 2005, INTERSPEECH.

[2] Nello Cristianini, et al. Kernel Methods for Pattern Analysis, 2003, ICTAI.

[3] Roland Auckenthaler, et al. Score Normalization for Text-Independent Speaker Verification Systems, 2000, Digit. Signal Process.

[4] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[5] Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[7] William M. Campbell, et al. A Sequence Kernel and its Application to Speaker Recognition, 2001, NIPS.

[8] William M. Campbell, et al. Advances in channel compensation for SVM speaker recognition, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[9] Andreas Stolcke, et al. Improvements in MLLR-Transform-based Speaker Recognition, 2006, 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop.

[10] Andreas Stolcke, et al. Generalized Linear Kernels for One-Versus-All Classification: Application to Speaker Recognition, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[11] William M. Campbell, et al. Phonetic Speaker Recognition with Support Vector Machines, 2003, NIPS.

[12] William M. Campbell, et al. Channel compensation for SVM speaker recognition, 2004, Odyssey.

[13] Thorsten Joachims, et al. Making large scale SVM learning practical, 1998.

[14] S.S. Kajarekar. Four weightings and a fusion: a cepstral-SVM system for speaker recognition, 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.