INTER-LINE DISTANCE ESTIMATION AND TEXT LINE EXTRACTION FOR UNCONSTRAINED ONLINE HANDWRITING

Methods for detecting and extracting whole text lines from unconstrained online handwritten text are described. The general approach is a "bottom-up" clustering of discrete strokes into small groups that are then merged into isolated lines of text. Initial clustering of strokes into groups is based on combined temporal and spatial stroke proximity. Spatial stroke proximity is gauged relative to estimated inter-line distance and mean character height. Two methods applicable to offline or online data are described for estimating the inter-line distance: autocorrelation (self-convolution) of the Y-axis projection histogram, and a fitting function. Inter-line distance is accurately determined for 99% of all text pages. Text line extraction accuracy on letters (correspondence) is 98.7% and on tables is 94.9%.
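
The autocorrelation estimate of inter-line distance can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `bin_size` parameter, the mean-centering of the histogram, and the rule of taking the strongest peak after the zero-lag lobe are all assumptions made for illustration. The input is simply the list of Y coordinates of the sampled pen points on a page.

```python
def y_projection_histogram(ys, bin_size=1.0):
    """Count sample points per horizontal band (Y-axis projection).

    Assumes ys is a non-empty list of Y coordinates (hypothetical input format).
    """
    y_min = min(ys)
    n_bins = int((max(ys) - y_min) / bin_size) + 1
    hist = [0] * n_bins
    for y in ys:
        hist[int((y - y_min) / bin_size)] += 1
    return hist


def autocorrelation(hist):
    """Unnormalized autocorrelation of the mean-centered histogram.

    Mean-centering is an assumption of this sketch; it makes the
    zero-lag lobe fall below zero so it can be skipped easily.
    """
    mean = sum(hist) / len(hist)
    centered = [h - mean for h in hist]
    n = len(centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]


def estimate_inter_line_distance(ys, bin_size=1.0):
    """Return the lag (in Y units) of the strongest autocorrelation peak
    beyond the zero-lag lobe, or None if no positive peak exists."""
    acf = autocorrelation(y_projection_histogram(ys, bin_size))
    # Skip the zero-lag lobe: advance until the autocorrelation is non-positive.
    lag = 1
    while lag < len(acf) and acf[lag] > 0:
        lag += 1
    if lag >= len(acf):
        return None
    best = max(range(lag, len(acf)), key=lambda k: acf[k])
    if acf[best] <= 0:
        return None
    return best * bin_size


# Synthetic page: five text lines spaced 40 units apart,
# each occupying a band of 11 units with 10 points per unit.
synthetic_ys = [
    base + offset
    for base in (0, 40, 80, 120, 160)
    for offset in range(-5, 6)
    for _ in range(10)
]
estimated = estimate_inter_line_distance(synthetic_ys)
```

On regularly spaced synthetic data like this, the strongest non-zero-lag peak falls at the line spacing. Real pages with skew, overlapping ascenders and descenders, or only one or two lines would likely need smoothing or a fallback, which this sketch does not attempt.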
