A comparison of string kernels and discrete hidden Markov models on a Spanish digit recognition task

String kernels were introduced recently to allow support vector machine (SVM) classifiers to operate on variable-length sequential data drawn from a discrete alphabet. They have been applied to text classification and bioinformatics, where notable results have been obtained. In this paper, string kernels are applied to a Spanish digit recognition task and their performance is compared with that of discrete hidden Markov models (DHMMs). String kernels are found to produce comparable results and may offer an alternative, discriminative approach for certain speech recognition tasks.
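As an illustration of how a string kernel compares two variable-length symbol sequences, the following minimal sketch computes a k-spectrum kernel, one well-known member of the string kernel family. It is an assumption for illustration only: the abstract does not say which string kernel the paper uses, and the function names and the example symbol sequences (imagined as quantized codebook indices for two utterances) are hypothetical.

```python
from collections import Counter
import math


def spectrum_features(seq, k):
    """Count every contiguous length-k subsequence (k-mer) of seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=2):
    """Inner product of the k-mer count vectors of s and t."""
    fs, ft = spectrum_features(s, k), spectrum_features(t, k)
    # Counter returns 0 for k-mers that do not occur in t.
    return sum(count * ft[kmer] for kmer, count in fs.items())


def normalized_spectrum_kernel(s, t, k=2):
    """Kernel scaled to [0, 1] so that sequence length matters less."""
    kss, ktt = spectrum_kernel(s, s, k), spectrum_kernel(t, t, k)
    if kss == 0 or ktt == 0:
        return 0.0  # at least one sequence is shorter than k
    return spectrum_kernel(s, t, k) / math.sqrt(kss * ktt)


# Hypothetical discrete symbol sequences of different lengths.
utterance_a = [3, 7, 7, 2]
utterance_b = [7, 7, 2, 2, 5]
similarity = spectrum_kernel(utterance_a, utterance_b, k=2)
```

Because the kernel is defined for sequences of any length, a matrix of such pairwise values can be passed to an SVM that accepts a precomputed kernel, with no need to pad or truncate the inputs.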
