Understanding HMM training for video gesture recognition

While developing a hidden Markov model (HMM) based video gesture recognition system for letters of the alphabet, we observed that careful selection of the model structure greatly improved recognition performance. This led us to two questions: why do some HMMs work so well for pattern recognition, and which factors affect the HMM training process? To address these fundamental questions of learning, we used simple triangle and square video gestures, for which a good HMM structure can be deduced analytically from knowledge of the physical process. We then compared these analytic models with models estimated by Baum-Welch training on the video gestures. This paper shows that, with appropriate constraints on model structure, Baum-Welch re-estimation leads to good HMMs that are very similar to those obtained analytically. These results corroborate our earlier work, which showed that the left-right (LR) banded HMM structure is remarkably effective at recognising video gestures compared with fully connected (ergodic) or LR HMM structures.
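
The three structures named above differ only in which state transitions are allowed. The following minimal Python sketch illustrates this under stated assumptions: ergodic allows every transition, LR allows only moves to the same or a later state, and LR banded allows only a self-loop or a step to the next state. It also runs one Baum-Welch update of the transition matrix for a discrete-observation HMM to show that transitions initialised to zero stay zero, which is how a structural constraint survives training. The function names, the discrete emission model, and the toy numbers are all illustrative assumptions, not taken from the paper.

```python
def transition_mask(n_states, kind):
    """Return an n_states x n_states 0/1 mask of allowed transitions."""
    allowed = {
        "ergodic": lambda i, j: True,               # every transition allowed
        "lr": lambda i, j: j >= i,                  # never move backwards
        "lr_banded": lambda i, j: j in (i, i + 1),  # self-loop or next state only
    }[kind]
    return [[1 if allowed(i, j) else 0 for j in range(n_states)]
            for i in range(n_states)]


def uniform_transitions(mask):
    """Spread probability evenly over the allowed transitions of each row."""
    return [[m / sum(row) for m in row] for row in mask]


def reestimate_transitions(A, B, pi, obs):
    """One Baum-Welch update of A for a discrete HMM and a single sequence."""
    n, T = len(A), len(obs)
    # Scaled forward pass: alpha[t] sums to 1, scale[t] is the normaliser.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            raw = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            raw = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(raw)
        scale.append(c)
        alpha.append([r / c for r in raw])
    # Scaled backward pass, reusing the forward scale factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scale[t + 1]
                   for i in range(n)]
    # Expected transition counts; each term contains A[i][j], so zeros persist.
    new_A = []
    for i in range(n):
        counts = [sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                      / scale[t + 1] for t in range(T - 1))
                  for j in range(n)]
        total = sum(counts)
        # Keep the old row if state i was never occupied before the last frame.
        new_A.append([c / total for c in counts] if total > 0 else list(A[i]))
    return new_A


# Toy example (all values hypothetical): 3 states, 2 observation symbols.
mask = transition_mask(3, "lr_banded")
A0 = uniform_transitions(mask)
B = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]  # emission probabilities per state
pi = [1.0, 0.0, 0.0]                      # LR models start in the first state
A1 = reestimate_transitions(A0, B, pi, [0, 0, 1, 1, 0, 1])
```

Because every expected transition count is proportional to the current value of the corresponding entry of `A`, the update in this sketch can only reweight transitions that the initial structure permits. It never introduces new ones.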
