Morphology Based Handwritten Line Segmentation Using Foreground and Background Information

Text line segmentation is currently an important research problem in historical document processing. Because inter-line distance and baseline skew both vary, line segmentation in unconstrained handwritten documents is very difficult. The task becomes more complicated when two consecutive text lines overlap or inter-penetrate. In this paper we propose a method, based mostly on morphological operations and the run-length smearing algorithm (RLSA), to segment individual text lines from unconstrained handwritten document images. First, RLSA is applied so that each word becomes a single component. Next, the foreground of this smoothed image is eroded to obtain seed components from the individual words of the document. Erosion is also applied to the background to find boundary information about the text lines. Finally, the lines are segmented using the positional information of the seed components together with the boundary information. We tested our scheme on images of five different scripts and we obtained encouraging results from
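The first three stages described above (RLSA smearing, foreground erosion, background erosion) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the smearing threshold, and the rectangular structuring element are all assumptions made for illustration, and the final step that combines seeds and boundaries into lines is omitted because the abstract does not specify it.

```python
# Illustrative sketch only. Images are lists of rows; 1 = ink, 0 = background.
# All names and parameter values here are hypothetical, not taken from the paper.

def rlsa_horizontal(image, threshold):
    """Fill horizontal background runs of length <= threshold that lie
    between two foreground pixels, so nearby strokes merge into one blob."""
    out = [row[:] for row in image]
    for row in out:
        last_fg = None  # column index of the most recent foreground pixel
        for x, value in enumerate(row):
            if value == 1:
                if last_fg is not None and 0 < x - last_fg - 1 <= threshold:
                    for k in range(last_fg + 1, x):
                        row[k] = 1
                last_fg = x
    return out


def erode(image, half_height, half_width):
    """Binary erosion with a (2*half_height+1) x (2*half_width+1) rectangle.
    Pixels outside the image are treated as 0."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx] == 1
                for dy in range(-half_height, half_height + 1)
                for dx in range(-half_width, half_width + 1)
            ))
    return out


def invert(image):
    """Swap foreground and background so the background can be eroded."""
    return [[1 - v for v in row] for row in image]


if __name__ == "__main__":
    page = [[1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]]
    smeared = rlsa_horizontal(page, threshold=2)   # word-level components
    seeds = erode(smeared, 0, 1)                   # shrunken foreground seeds
    boundaries = erode(invert(smeared), 0, 1)      # shrunken background regions
    print(smeared, seeds, boundaries, sep="\n")
```

In this sketch the small gap inside a word is closed by smearing while the wider gap survives, and erosion then shrinks both the word blob and the surrounding background. A real document would need a threshold and structuring element tuned to its stroke width and line spacing.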
