GMM kernel by Taylor series for speaker verification

Currently, the combination of Gaussian mixture models (GMMs) and support vector machines (SVMs) produces state-of-the-art performance on the text-independent speaker verification task. Many kernels have been reported for combining GMM and SVM. In this paper, we propose a novel kernel that represents the GMM distribution through its Taylor expansion and uses this representation as the input to the SVM. The utterance-specific GMM is represented as a combination of the orders of its Taylor series expanded at the means of the Gaussian components. We extract the distribution information around the means of the Gaussian components because each mean can naturally be assumed to indicate a feature cluster in the feature space. The kernel then computes the ensemble distance between the orders of the Taylor series. On the NIST speaker recognition evaluation (SRE) 2006 core task, the new kernel shows relative improvements in equal error rate (EER) of up to 7.1% and 11.7% for male and female speakers, respectively, compared with a K-L divergence based SVM system.
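The idea of comparing GMMs through Taylor terms taken at the component means can be illustrated with a minimal sketch. It is not the formulation used in the paper: it assumes diagonal covariances, keeps only the zeroth- and first-order terms (density value and gradient), evaluates both utterance GMMs at a shared set of expansion points (for example the means of a common background model), and replaces the ensemble distance with a plain inner product. All function names are hypothetical.

```python
import numpy as np

def gmm_density_and_gradient(x, weights, means, variances):
    """Value and gradient of a diagonal-covariance GMM density at point x.

    weights: (K,), means: (K, D), variances: (K, D), x: (D,)
    """
    diff = x - means                                           # (K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_expo = -0.5 * np.sum(diff ** 2 / variances, axis=1)
    comp = weights * np.exp(log_norm + log_expo)               # weighted component densities, (K,)
    value = comp.sum()
    # d/dx N(x; mu, var) = -N(x; mu, var) * (x - mu) / var
    grad = -(comp[:, None] * diff / variances).sum(axis=0)     # (D,)
    return value, grad

def taylor_features(points, weights, means, variances):
    """Stack zeroth- and first-order Taylor terms at every expansion point."""
    feats = []
    for x in points:
        value, grad = gmm_density_and_gradient(x, weights, means, variances)
        feats.append(np.concatenate(([value], grad)))
    return np.concatenate(feats)

def taylor_kernel(gmm_a, gmm_b, points):
    """Inner product of the stacked Taylor terms (an assumed stand-in
    for the ensemble distance between orders described in the abstract)."""
    fa = taylor_features(points, *gmm_a)
    fb = taylor_features(points, *gmm_b)
    return float(fa @ fb)

# Toy usage: two 2-component GMMs in 2-D compared at shared expansion points.
points = np.array([[0.0, 0.0], [1.0, 2.0]])
gmm_a = (np.array([0.4, 0.6]), points.copy(), np.array([[1.0, 2.0], [0.5, 1.0]]))
gmm_b = (np.array([0.5, 0.5]), points + 0.1, np.array([[1.0, 1.0], [1.0, 1.0]]))
k_ab = taylor_kernel(gmm_a, gmm_b, points)
```

Higher-order terms (for example the Hessian at each point) could be appended to the feature vector in the same way, and the resulting Gram matrix could be passed to an SVM that accepts precomputed kernels.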
