HMM-based handwritten word recognition: on the optimization of the number of states, training iterations and Gaussian components

In off-line handwriting recognition, classifiers based on hidden Markov models (HMMs) have become very popular. However, while well-established training algorithms exist that optimize the transition and output probabilities of a given HMM architecture, the architecture itself, and in particular the number of states, must be chosen "by hand". The number of training iterations and the output distributions must likewise be defined by the system designer. In this paper we examine several strategies for optimizing these parameters in an HMM classifier that works with continuous feature values. The proposed optimization strategies are evaluated in the context of a handwritten word recognition task.
