Paper Information - Handwritten Kannada Document Image Processing using Optical Character Recognition

The objective of Optical Character Recognition (OCR) is the automatic reading of optically sensed document text, translating human-readable characters into machine-readable codes. In OCR, the text lines in a document must be segmented properly before recognition. English character recognition (CR) has been studied extensively over the last half century and has progressed to a level sufficient to produce technology-driven applications. The same is not true for Indian languages, whose scripts are structurally complex and computationally demanding to process. This is the motivation for choosing OCR for the Kannada language. A KSRTC bus pass application form written in Kannada is chosen for processing and recognition. The OCR system first segments the whole document into text lines, then into words, and then into individual characters. Features are then extracted from these characters and used to recognize and classify them.
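
The line, word, and character stages described above can be illustrated with a minimal sketch. The abstract does not say which segmentation method the authors use; the code below assumes a simple projection-profile approach on a binarized image (ink = 1, background = 0). The function names `find_runs`, `segment_lines`, and `segment_columns` and the `min_gap` parameter are hypothetical, introduced only for this illustration.

```python
import numpy as np

def find_runs(profile, min_gap=1):
    """Return (start, end) pairs of consecutive non-zero entries in a 1-D profile.

    Runs separated by fewer than `min_gap` zero entries are merged.
    `end` is exclusive, so a run can be used directly as a slice.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at a blank entry
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same unit
        else:
            merged.append((s, e))
    return merged

def segment_lines(binary):
    """Split a binary page into text lines using the row-wise ink count."""
    return find_runs(binary.sum(axis=1))

def segment_columns(line_img, min_gap=1):
    """Split one text line into column spans using the column-wise ink count.

    A large `min_gap` (assumed word spacing) yields words;
    `min_gap=1` yields individual connected column spans (characters).
    """
    return find_runs(line_img.sum(axis=0), min_gap)

# Toy page: two text lines; the first holds a two-character word and a single character.
page = np.zeros((7, 12), dtype=int)
page[1:3, 0:2] = 1
page[1:3, 3:5] = 1
page[1:3, 8:10] = 1
page[4:6, 2:6] = 1

lines = segment_lines(page)
first_line = page[lines[0][0]:lines[0][1]]
words = segment_columns(first_line, min_gap=3)
chars = segment_columns(first_line, min_gap=1)
```

Projection profiles assume that lines and characters are separated by fully blank rows or columns. Handwritten text often violates this through skew, touching strokes, or overlapping lines, so a practical system would likely need a more robust method than this sketch.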

Mayur M Patil | Akkamahadevi R Hanni
