A generalised derivative kernel for speaker verification

An important aspect of SVM-based speaker verification systems is the choice of dynamic kernel. For the GLDS kernel, a static kernel is used to map each observation into a higher-order feature space, and features are then obtained by taking a simple average over all frames. Derivative kernels, such as the Fisher kernel, use a generative model as a principled way of extracting a fixed set of features from each utterance. However, both the model and the features are defined in terms of the original observations. Here, a dynamic kernel is described that combines these two approaches. In general, it is not possible to explicitly train a model in the feature space associated with a static kernel. However, by using a suitable metric with approximate component posteriors, this form of dynamic kernel can be computed. The resulting kernel includes the GLDS and derivative kernels as special cases and is also closely related to parametric kernels such as the GMM supervector kernel. Preliminary results using this kernel are presented on the 2002 NIST SRE dataset.
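To make the two ingredients concrete, the following is a minimal sketch of the utterance-level features behind a GLDS-style kernel and a derivative kernel. It is an illustration, not the implementation used in the paper. All function names are hypothetical, the static kernel is assumed to be a polynomial kernel with an explicit monomial expansion, the GLDS normalising matrix is assumed to be the identity, and the derivative features are taken only with respect to the means of a diagonal-covariance GMM.

```python
import numpy as np
from itertools import combinations_with_replacement


def poly_expand(frames, degree=2):
    """Explicit feature map of an (assumed) polynomial static kernel:
    every monomial of the frame elements up to `degree`, one row per frame."""
    n_frames, dim = frames.shape
    columns = [np.ones(n_frames)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(dim), d):
            columns.append(np.prod(frames[:, list(idx)], axis=1))
    return np.stack(columns, axis=1)


def glds_features(frames, degree=2):
    """GLDS-style utterance features: average of the expanded frames."""
    return poly_expand(frames, degree).mean(axis=0)


def glds_kernel(frames_a, frames_b, degree=2):
    """Inner product of averaged expansions (identity metric assumed)."""
    return float(glds_features(frames_a, degree) @ glds_features(frames_b, degree))


def gmm_posteriors(frames, weights, means, variances):
    """Component posteriors of a diagonal-covariance GMM, shape (T, M)."""
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )
    log_joint = np.log(weights) + log_lik
    log_joint -= log_joint.max(axis=1, keepdims=True)          # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def mean_derivative_features(frames, weights, means, variances):
    """Derivative (Fisher-score style) features: frame-averaged gradient of
    the GMM log-likelihood with respect to the component means."""
    post = gmm_posteriors(frames, weights, means, variances)   # (T, M)
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    score = (post[:, :, None] * diff / variances).mean(axis=0) # (M, D)
    return score.ravel()
```

In both cases a variable-length utterance (a `(T, D)` array of frames) is reduced to a fixed-length vector that an SVM can consume. The kernel described in the abstract differs in that the generative model is notionally defined over the expanded features rather than the raw frames, which this sketch does not attempt.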