Databases for research on recognition of handwritten characters of Indian scripts

This paper describes three image databases of handwritten isolated numerals in three Indian scripts: Devnagari, Bangla and Oriya. The databases comprise grayscale images of 22556 Devnagari numerals written by 1049 persons, 12938 Bangla numerals written by 556 persons and 5970 Oriya numerals written by 356 persons. The images were scanned from three kinds of handwritten documents: postal mail, job application forms, and a set of forms specially designed by the collectors for this purpose. The only restriction imposed on the writers was to write each numeral within a rectangular box. The databases avoid two common limitations: they were not developed in a laboratory environment, nor are their samples non-uniformly distributed over the classes. To support comparison of results, each database has been divided into a training set and a test set.
