Training Hidden Markov Models with Multiple Observations-A Combinatorial Method

Hidden Markov models (HMM) are stochastic models capable of statistical learning and classification. They have been applied in speech recognition and handwriting recognition because of their great adaptability and versatility in handling sequential signals. On the other hand, as these models have a complex structure and also because the involved data sets usually contain uncertainty, it is difficult to analyze the multiple observation training problem without certain assumptions. For many years researchers have used the training equations of Levinson (1983) in speech and handwriting applications, simply assuming that all observations are independent of each other. This paper presents a formal treatment of HMM multiple observation training without imposing the above assumption. In this treatment, the multiple observation probability is expressed as a combination of individual observation probabilities without losing generality. This combinatorial method gives one more freedom in making different dependence-independence assumptions. By generalizing Baum's auxiliary function into this framework and building up an associated objective function using the Lagrange multiplier method, it is proven that the derived training equations guarantee the maximization of the objective function. Furthermore, we show that Levinson's training equations can be easily derived as a special case in this treatment.

[1]  Pierre Baldi,et al.  Smooth On-Line Learning Algorithms for Hidden Markov Models , 1994, Neural Computation.

[2]  Kai-Fu Lee,et al.  Automatic Speech Recognition , 1989 .

[3]  L. Baum,et al.  Growth transformations for functions on manifolds. , 1968 .

[4]  New York Dover,et al.  ON THE CONVERGENCE PROPERTIES OF THE EM ALGORITHM , 1983 .

[5]  Makoto Iwayama,et al.  A New Algorithm for Automatic Configuration of Hidden Markov Models , 1993, ALT.

[6]  Lalit R. Bahl,et al.  A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  L. Baum,et al.  An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .

[8]  HuJianying,et al.  HMM Based On-Line Handwriting Recognition , 1996 .

[9]  David Nahamoo,et al.  A Fast Statistical Mixture Algorithm for On-Line Handwriting Recognition , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[11]  L. Baum,et al.  Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .

[12]  Paramvir Bahl,et al.  Recognition of handwritten word: first and second order hidden Markov model based approach , 1988, Proceedings CVPR '88: The Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[14]  Sung-Bae Cho,et al.  An HMM/MLP Architecture for Sequence Recognition , 1995, Neural Computation.

[15]  Stephen E. Levinson,et al.  A speaker-independent, syntax-directed, connected word recognition system based on hidden Markov models and level building , 1985, IEEE Trans. Acoust. Speech Signal Process..

[16]  Jianing Dai Robust estimation of HMM parameters using fuzzy vector quantization and Parzen's window , 1995, Pattern Recognit..

[17]  Paramvir Bahl,et al.  Recognition of handwritten word: First and second order hidden Markov model based approach , 1989, Pattern Recognit..

[18]  L. R. Rabiner,et al.  An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition , 1983, The Bell System Technical Journal.

[19]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[20]  Raj Reddy,et al.  Automatic Speech Recognition: The Development of the Sphinx Recognition System , 1988 .

[21]  L. Baum,et al.  An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology , 1967 .

[22]  Ramjee Prasad,et al.  Hidden Markov models applied to on-line handwritten isolated character recognition , 1994, IEEE Trans. Image Process..

[23]  Torsten Caesar,et al.  Sophisticated topology of hidden Markov models for cursive script recognition , 1993, Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93).

[24]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[25]  Christoph Neukirchen,et al.  A comparison between continuous and discrete density hidden Markov models for cursive handwriting recognition , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[26]  Jianying Hu,et al.  HMM Based On-Line Handwriting Recognition , 1996, IEEE Trans. Pattern Anal. Mach. Intell..