Morphological Operations and Projection Profiles based Segmentation of Handwritten Kannada Document

Segmentation is an important task in any Optical Character Recognition (OCR) system. It separates a document image into lines, words and characters, and the accuracy of an OCR system depends largely on the segmentation algorithm used. Segmenting handwritten text in Indian languages such as Kannada, Telugu and Assamese is more difficult than in Latin-based languages because of the structural complexity of the scripts and their larger character sets. These scripts contain vowels, consonants and compound characters, and some characters may overlap. Despite many successful OCR systems worldwide, the development of OCR tools for Indian languages is still ongoing. Character segmentation plays an important role in character recognition because incorrectly segmented characters are unlikely to be recognized correctly. In this paper, a scheme for segmenting handwritten Kannada script into lines, words and characters using morphological operations and projection profiles is proposed. The method was tested on totally unconstrained handwritten Kannada script, which poses additional challenges because of the complexity of the script. Using morphology, text lines were extracted with an average extraction rate of 94.5%. Because of varying inter-word and intra-word gaps, average segmentation rates of 82.35% for words and 73.08% for characters were obtained.
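The abstract does not spell out the algorithm, so the following is only a minimal sketch of how projection profiles and a morphological dilation can be combined for this kind of segmentation; it is not the authors' implementation. It assumes a binarized page stored as a list of rows with 1 for ink and 0 for background. The function names (`runs`, `dilate_horizontal`, `segment`) and the `half_width` parameter are hypothetical. Lines are taken from zero valleys in the horizontal projection profile, characters from zero valleys in the vertical profile of each line, and words by dilating the line horizontally so that narrow intra-word gaps close while wider inter-word gaps survive.

```python
def runs(profile):
    """Return [start, end) spans where a projection profile is non-zero."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def dilate_horizontal(img, half_width):
    """Binary dilation with a flat 1 x (2*half_width + 1) structuring element."""
    out = []
    for row in img:
        out.append([
            1 if any(row[max(0, x - half_width):x + half_width + 1]) else 0
            for x in range(len(row))
        ])
    return out


def segment(img, half_width=1):
    """Split a binary page into lines, words and character column spans."""
    result = []
    # Horizontal projection profile: ink count per row; zero rows separate lines.
    row_profile = [sum(row) for row in img]
    for top, bottom in runs(row_profile):
        line = img[top:bottom]
        # Vertical projection profile of the raw line gives character spans.
        char_spans = runs([sum(col) for col in zip(*line)])
        # Dilation closes gaps of up to 2*half_width columns, merging the
        # characters of one word into a single blob (assumed gap threshold).
        smeared = dilate_horizontal(line, half_width)
        blobs = runs([sum(col) for col in zip(*smeared)])
        words = [
            [c for c in char_spans if left <= c[0] and c[1] <= right]
            for left, right in blobs
        ]
        result.append({"rows": (top, bottom), "words": words})
    return result


# Tiny synthetic page: two text lines, the first containing two words.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
]
result = segment(page)
```

A fixed `half_width` is the weak point of such a sketch: when inter-word and intra-word gaps vary, as the abstract reports for unconstrained handwriting, a single threshold will merge some words and split others, which is consistent with word and character rates being lower than the line rate.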
