Combination of hidden Markov models with dynamic time warping for speech recognition

We combine hidden Markov models of various topologies with nearest-neighbor classification techniques in an exponential modeling framework, using a model selection algorithm, and obtain significant error rate reductions on an isolated-word digit recognition task. This work is a preliminary investigation of large-scale modeling techniques intended for large-vocabulary continuous speech recognition.
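As a rough illustration of what combining these two kinds of scores in an exponential (log-linear) model could look like, the sketch below computes a dynamic time warping distance between two one-dimensional sequences and then mixes a per-class HMM log-likelihood with a per-class DTW distance into class posteriors. The abstract does not specify the feature functions, the weights, or the model selection step, so the names `dtw_distance` and `combine_scores`, the absolute-difference local cost, and the weights are all assumptions made for illustration; the HMM log-likelihoods are taken as given inputs.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two sequences of floats.

    Assumption: local cost is the absolute difference, and the allowed
    steps are insertion, deletion, and match (no slope constraints).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def combine_scores(hmm_loglik, dtw_dist, weights=(1.0, 1.0)):
    """Log-linear combination of two per-class scores.

    p(c | x) is proportional to exp(w_hmm * loglik_c - w_dtw * dist_c).
    The weights are hypothetical; in practice they would be trained.
    """
    w_hmm, w_dtw = weights
    logits = {c: w_hmm * hmm_loglik[c] - w_dtw * dtw_dist[c] for c in hmm_loglik}
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits.values())
    z = sum(math.exp(v - top) for v in logits.values())
    return {c: math.exp(v - top) / z for c, v in logits.items()}


if __name__ == "__main__":
    # Hypothetical scores for two digit classes.
    hmm_loglik = {"one": -10.0, "two": -12.0}
    dtw_dist = {"one": 3.0, "two": 1.0}
    print(combine_scores(hmm_loglik, dtw_dist, weights=(1.0, 0.5)))
```

In a nearest-neighbor setting, the DTW distance for a class would typically be the minimum distance from the test utterance to any stored template of that class; that step is omitted here to keep the sketch short.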
