Candidate search and elimination approach for Telugu OCR

Telugu is one of the prominent scripts of India and Asia. We propose an OCR system for Telugu based on a candidate search and elimination technique. Initial recognition candidates are found by applying a zoning method to the input glyphs. We propose cavities as a structural approach suited specifically to Telugu script, in which cavity vectors are used to prune the candidates found by zoning. A final template matching stage, using controlled nonlinear normalization, concludes the search. The search can stop at any stage as soon as a unique candidate is found. A recognition accuracy of 97-98% was achieved on real images scanned from Telugu literature.
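
The staged search with early termination described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the 2x2 zoning grid, the L1 distance, the tolerance `tol`, and all function names are hypothetical. The abstract does not define how cavity vectors are computed, so the pruning stage is left as a caller-supplied function `prune`, and a plain pixel-mismatch count stands in for template matching with controlled nonlinear normalization.

```python
from typing import Callable, Dict, List, Optional, Tuple

Glyph = List[List[int]]  # binary image: 1 = ink, 0 = background


def zoning_features(glyph: Glyph, rows: int = 2, cols: int = 2) -> List[float]:
    """Ink density in each cell of a rows x cols grid laid over the glyph."""
    h, w = len(glyph), len(glyph[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


def l1_distance(a: List[float], b: List[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def pixel_mismatch(a: Glyph, b: Glyph) -> int:
    """Number of differing pixels; assumes both glyphs have the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(
    glyph: Glyph,
    templates: Dict[str, Glyph],
    prune: Optional[Callable[[Glyph, List[str]], List[str]]] = None,
    tol: float = 0.25,
) -> Tuple[Optional[str], str]:
    """Return (label, stage at which the search ended)."""
    # Stage 1: zoning proposes every template whose zone densities are close.
    feats = zoning_features(glyph)
    candidates = [
        label for label, t in templates.items()
        if l1_distance(feats, zoning_features(t)) <= tol
    ]
    if not candidates:
        return None, "zoning"
    if len(candidates) == 1:
        return candidates[0], "zoning"  # unique candidate: stop early

    # Stage 2: structural pruning (placeholder for the cavity-vector test).
    if prune is not None:
        pruned = prune(glyph, candidates)
        if len(pruned) == 1:
            return pruned[0], "pruning"  # unique candidate: stop early
        if pruned:  # keep the earlier list if pruning removed everything
            candidates = pruned

    # Stage 3: template matching decides among the remaining candidates.
    best = min(candidates, key=lambda label: pixel_mismatch(glyph, templates[label]))
    return best, "template"
```

Calling `recognize(glyph, templates)` returns both the chosen label and the stage that produced it, which makes the early exit visible: a glyph whose zoning features match only one template never reaches the later, more expensive stages.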
