A Sequential Pattern Classifier Based on Hidden Markov Kernel Machine and Its Application to Phoneme Classification

This paper describes a novel classifier for sequential data based on nonlinear classification derived from kernel methods. In the proposed method, kernel methods are used to enhance the emission probability density functions (pdfs) of hidden Markov models (HMMs). Because the kernel-enhanced emission pdfs provide sufficient nonlinear classification performance on their own, the proposed method does not need mixture models such as Gaussian mixture models (GMMs), which can suffer from overfitting and local optima. Unlike the methods used in earlier studies on sequential pattern classification with kernel methods, our method can be regarded as an extension of conventional HMMs, so it fully models the transitions of hidden states together with the observed vectors. As a result, it can be applied to many applications built on conventional HMMs, particularly speech recognition. As a preliminary experiment, we carried out isolated phoneme classification to evaluate the effectiveness of the proposed sequential pattern classifier. We confirmed that the proposed method consistently improved on conventional HMMs with Gaussian-mixture emission pdfs trained by the maximum likelihood and the maximum mutual information procedures.
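To make the general idea concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes that each hidden state scores an observed vector with a weighted expansion of Gaussian (RBF) kernels in place of a GMM log-density, and that a sequence is classified by the model with the highest standard forward score. The function names, the unnormalized emission score, and the toy parameters are all hypothetical, and no training procedure is shown.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_log_emission(x, anchors, weights, gamma=4.0):
    """Hypothetical unnormalized log emission score: a weighted kernel expansion.

    This stands in for a per-state GMM log-density; it is an assumption
    made for illustration, not the paper's emission model.
    """
    return sum(w * rbf_kernel(x, a, gamma) for w, a in zip(weights, anchors))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_score(seq, log_init, log_trans, states):
    """Standard HMM forward recursion in the log domain.

    states[j] is an (anchors, weights) pair describing state j's kernel expansion.
    """
    n = len(states)
    alpha = [log_init[j] + kernel_log_emission(seq[0], *states[j]) for j in range(n)]
    for x in seq[1:]:
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n)])
            + kernel_log_emission(x, *states[j])
            for j in range(n)
        ]
    return logsumexp(alpha)

def classify(seq, models):
    """Return the name of the model with the highest forward score."""
    return max(models, key=lambda name: forward_log_score(seq, *models[name]))

# Toy two-state models over 1-D observations (illustrative values only).
log_init = [math.log(0.9), math.log(0.1)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.1), math.log(0.9)]]
models = {
    # "a": state 0 responds to values near 0, state 1 to values near 1.
    "a": (log_init, log_trans, [([[0.0]], [3.0]), ([[1.0]], [3.0])]),
    # "b": the reverse ordering.
    "b": (log_init, log_trans, [([[1.0]], [3.0]), ([[0.0]], [3.0])]),
}

rising = [[0.0], [0.1], [0.9], [1.0]]
falling = [[1.0], [0.9], [0.1], [0.0]]
print(classify(rising, models), classify(falling, models))
```

Because the transition structure and the forward recursion are unchanged from a conventional HMM, only the emission scoring function differs in this sketch, which mirrors the abstract's description of the method as an extension of conventional HMMs.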
