A Two Stage Recognition Scheme for Handwritten Tamil Characters

India is a multilingual, multiscript country with more than 18 languages and 10 major scripts. Research on recognizing handwritten characters of these Indian scripts remains limited. Tamil, an official and widely used script in southern India, Singapore, Malaysia, and Sri Lanka, has a large character set that includes many compound characters. Only a few works on handwriting recognition for this large character set have been reported in the literature. Recently, HP Labs India developed a database of handwritten Tamil characters. In this paper, we describe an off-line recognition approach based on this database. The proposed method consists of two stages. In the first stage, we apply an unsupervised clustering method to create a smaller number of groups of handwritten Tamil character classes. In the second stage, we apply a supervised classification technique within each of these smaller groups for final recognition. The two stages use different features. The proposed two-stage recognition scheme provided acceptable classification accuracies on both the training and test sets of this database.
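
The two-stage idea can be illustrated with a minimal sketch. The abstract does not name the clustering method, the classifier, or the features, so everything concrete here is an assumption made for illustration: k-means stands in for the unsupervised stage, a nearest-centroid rule stands in for the supervised stage, and `coarse` and `fine` are hypothetical feature functions representing the two different feature sets. The toy 4-dimensional samples are invented and are not drawn from the HP Labs India database.

```python
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centers)].append(p)
        # An empty bucket keeps its previous center.
        centers = [mean(b) if b else c for b, c in zip(buckets, centers)]
    return centers


class TwoStageRecognizer:
    """Illustrative only: k-means grouping, then nearest centroid per group."""

    def __init__(self, coarse, fine, n_groups):
        self.coarse, self.fine, self.n_groups = coarse, fine, n_groups

    def fit(self, samples, labels):
        # Stage 1 (unsupervised): cluster stage-one features; labels are unused.
        self.centers = kmeans([self.coarse(s) for s in samples], self.n_groups)
        # Stage 2 (supervised): per group, one stage-two centroid per class.
        grouped = defaultdict(lambda: defaultdict(list))
        for s, y in zip(samples, labels):
            g = nearest(self.coarse(s), self.centers)
            grouped[g][y].append(self.fine(s))
        self.models = {g: {y: mean(v) for y, v in classes.items()}
                       for g, classes in grouped.items()}
        return self

    def predict(self, sample):
        # Route the sample to the closest non-empty group, then classify there.
        c = self.coarse(sample)
        g = min(self.models, key=lambda i: dist(c, self.centers[i]))
        f = self.fine(sample)
        return min(self.models[g], key=lambda y: dist(f, self.models[g][y]))


# Toy data: the first two components feed stage one, the last two feed stage two.
samples = [(0, 0, 0, 0), (10, 10, 0, 0), (1, 0, 0, 1), (11, 10, 1, 0),
           (0, 1, 5, 5), (10, 11, 5, 5), (1, 1, 5, 6), (11, 11, 6, 5)]
labels = ["a", "c", "a", "c", "b", "d", "b", "d"]

model = TwoStageRecognizer(coarse=lambda s: s[:2], fine=lambda s: s[2:], n_groups=2)
model.fit(samples, labels)
print(model.predict((0.5, 0.5, 4, 6)))
print(model.predict((10.5, 10.5, 1, 1)))
```

In this sketch the first stage narrows four classes to a group of two, so the second-stage classifier only has to separate the classes that are confusable within that group. A class whose training samples fall into several clusters appears in several groups, which is one possible design choice and not necessarily the one used in the paper.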
