Using Hidden Markov Models as a Tool for Handwritten Text Line Segmentation

This paper investigates the segmentation of off-line cursive handwritten text lines into individual words. The problem is treated as a text line recognition task adapted to the characteristics of segmentation: at each position of a text line, it must be decided whether that position belongs to a letter of a word or to a space between two words. The text line is thus recognized as a sequence of non-space and space characters. For this purpose, three different recognizers based on hidden Markov models are designed, and results of writer-dependent as well as writer-independent experiments are reported.
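
To make the space versus non-space formulation concrete, the following is a minimal sketch of Viterbi decoding over a toy two-state hidden Markov model. It is not one of the three recognizers described in the paper. The per-column observation symbols (`ink`, `blank`), the state names, all probabilities, and the helpers `viterbi` and `words_from_path` are assumptions chosen purely for illustration.

```python
import math

# Hypothetical two-state model: every image column is either part of a word
# or part of an inter-word space. All numbers below are illustrative only.
STATES = ("word", "space")
START = {"word": 0.5, "space": 0.5}
TRANS = {"word": {"word": 0.9, "space": 0.1},
         "space": {"word": 0.2, "space": 0.8}}
# A "word" column may be blank (gap between letters); a "space" column rarely has ink.
EMIT = {"word": {"ink": 0.8, "blank": 0.2},
        "space": {"ink": 0.05, "blank": 0.95}}


def viterbi(obs, start=START, trans=TRANS, emit=EMIT):
    """Return the most likely state sequence for a list of discrete observations."""
    # Log probabilities avoid numerical underflow on long text lines.
    score = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in STATES}]
    back = []
    for o in obs[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(trans[p][s]))
            cur[s] = prev[best] + math.log(trans[best][s]) + math.log(emit[s][o])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def words_from_path(path):
    """Convert a per-column labelling into (start, end) word spans, end exclusive."""
    spans, begin = [], None
    for i, s in enumerate(path):
        if s == "word" and begin is None:
            begin = i
        elif s == "space" and begin is not None:
            spans.append((begin, i))
            begin = None
    if begin is not None:
        spans.append((begin, len(path)))
    return spans


# Toy line: a one-column gap inside the first word, then a four-column gap.
columns = ["ink", "ink", "blank", "ink", "ink",
           "blank", "blank", "blank", "blank",
           "ink", "ink", "ink"]
path = viterbi(columns)
spans = words_from_path(path)
print(spans)  # [(0, 5), (9, 12)]
```

With these assumed parameters, the single blank column is absorbed into the first word, while the longer run of blank columns is labelled as a space. This illustrates why decoding the whole sequence can behave differently from thresholding each gap in isolation.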
