A Max Kernel For Text-Independent Speaker Verification Systems

In this paper, we present a principled SVM-based speaker verification system. We develop a general approach to comparing two sequences of frames that enables the use of any kernel at the frame level. We then propose an extension of this approach based on the Max operator. The new system is compared on the Polyvar database with state-of-the-art GMM-based systems and with other SVM-based systems found in the literature. In most cases, the new system outperforms the other systems by a statistically significant margin.
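As a rough illustration of the idea, the sketch below builds a sequence-level similarity from an arbitrary frame-level kernel, first by averaging over all frame pairs and then by keeping only the best-matching frame through a max. The abstract does not give the exact formulas, so the function names, the RBF frame kernel, the `gamma` value, and the symmetrized form of the max variant are all assumptions made for this example and not necessarily the formulation used in the paper.

```python
import math


def rbf(x, y, gamma=0.5):
    """Frame-level RBF kernel between two feature vectors (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_sequence_kernel(X, Y, frame_kernel=rbf):
    """Average the frame kernel over every pair of frames from X and Y."""
    total = sum(frame_kernel(x, y) for x in X for y in Y)
    return total / (len(X) * len(Y))


def max_sequence_kernel(X, Y, frame_kernel=rbf):
    """Hypothetical max variant: each frame keeps only its best match
    in the other sequence; the two directions are averaged for symmetry."""
    def one_sided(A, B):
        return sum(max(frame_kernel(a, b) for b in B) for a in A) / len(A)

    return 0.5 * (one_sided(X, Y) + one_sided(Y, X))


# Two toy "utterances": lists of 2-dimensional frames of different lengths.
X = [[0.0, 0.0], [1.0, 1.0]]
Y = [[0.0, 0.0]]

mean_score = mean_sequence_kernel(X, Y)
max_score = max_sequence_kernel(X, Y)
```

Because both functions take `frame_kernel` as a parameter, any frame-level kernel can be swapped in without changing the sequence-level code, which is the property the abstract emphasizes. The resulting sequence scores could then be supplied to an SVM as a precomputed similarity matrix.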
