Word Segmentation and Baseline Detection in Handwritten Documents Using Isothetic Covers

A novel approach to word segmentation and baseline detection in handwritten documents is proposed. It is based on certain structural properties of the isothetic covers that tightly enclose the words in a handwritten document. For an appropriate grid size, the isothetic covers successfully segregate the words so that each cover corresponds to a particular word. By analyzing the horizontal chords of these covers, the corresponding baselines are extracted. The method is fast, robust, and efficient because it traverses the word boundaries in a combinatorial manner and uses a limited set of operations strictly in the integer domain. Results on several Bengali and English handwriting samples are given to demonstrate its strength and elegance.
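To make the idea concrete, here is a minimal Python sketch of grid-based word grouping and chord-based baseline estimation. It is not the authors' algorithm: the combinatorial boundary traversal is replaced by a plain scan that marks every grid cell containing ink, and the baseline rule (the bottom edge of the lowest cell row whose horizontal chord is at least `ratio` times the longest chord) is an assumption made for illustration. All function names and the `ratio` parameter are hypothetical. The sketch uses integer arithmetic apart from the single `ratio` comparison.

```python
from collections import deque


def occupied_cells(image, g):
    """Return the set of (cell_row, cell_col) grid cells of size g x g that contain ink."""
    return {(y // g, x // g)
            for y, row in enumerate(image)
            for x, ink in enumerate(row) if ink}


def group_cells(cells):
    """Group occupied cells into 8-connected components, one per candidate word."""
    remaining = set(cells)
    groups = []
    while remaining:
        seed = remaining.pop()
        comp, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    n = (r + dr, c + dc)
                    if n in remaining:
                        remaining.remove(n)
                        comp.add(n)
                        queue.append(n)
        groups.append(comp)
    return groups


def baseline_y(comp, g, ratio=0.5):
    """Estimate the baseline (pixel y) of one cell group from its horizontal chords.

    Assumed heuristic: rows with short chords are ascenders or descenders,
    so the baseline is the bottom edge of the lowest long-chord row.
    """
    spans = {}
    for r, c in comp:
        lo, hi = spans.get(r, (c, c))
        spans[r] = (min(lo, c), max(hi, c))
    chords = {r: hi - lo + 1 for r, (lo, hi) in spans.items()}
    longest = max(chords.values())
    body_rows = [r for r, n in chords.items() if n >= ratio * longest]
    return (max(body_rows) + 1) * g


def segment_words(image, g):
    """Return (left_x, baseline_y) in pixels for each candidate word, left to right."""
    words = []
    for comp in group_cells(occupied_cells(image, g)):
        left = min(c for _, c in comp) * g
        words.append((left, baseline_y(comp, g)))
    return sorted(words)
```

Here `image` is a binary raster given as a list of rows, with nonzero values marking ink. The grid size `g` plays the role described above: if it is too coarse, the cells of neighbouring words touch and the words merge into one group, and if it is too fine, a single word may split into several groups.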
