Comparing and evaluating HMM ensemble training algorithms using train and test and condition number criteria

Abstract

Hidden Markov Models have many applications in signal processing and pattern recognition, but their convergence-based training algorithms are known to be overly sensitive to the random choice of initial model. This paper describes the boundary between the regions in which ensemble learning is superior to Rabiner’s multiple-sequence Baum-Welch training method, and proposes techniques for determining the best method in a given situation. It also studies the suitability of the training methods using the condition number, a recently proposed diagnostic tool for testing the quality of the model. A new method for training Hidden Markov Models, the Viterbi Path Counting algorithm, is introduced and is found to perform significantly better than current methods in a range of trials.
