Handwritten Script Identification from a Bi-Script Document at Line Level using Gabor Filters

In a country like India, where many scripts are in use, automatic identification of printed and handwritten scripts facilitates important applications such as sorting document images and searching online archives of document images. This paper presents a Gabor-feature-based approach for identifying Indian scripts in handwritten document images. Eight popular Indian scripts are considered. Each pre-processed image consists of a portion of a line extracted manually from a handwritten document, and features are extracted from it using Gabor filters. Classification performance is analyzed using the k-nearest neighbor (KNN) classifier, with experiments performed using five-fold cross-validation. A recognition rate of 100% is achieved for a data set size of 100 images.
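The pipeline summarized above (Gabor filter bank, KNN classification, five-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the filter parameters, the feature statistics, or the value of k, so the kernel size, wavelengths, number of orientations, the mean/standard-deviation features, and k = 1 are all assumptions. The striped images produced by the hypothetical toy_dataset helper are synthetic stand-ins for line images of two scripts.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real (cosine) part of a 2-D Gabor filter; all parameters are assumed values."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()  # zero mean, so flat regions give no response

def gabor_features(image, wavelengths=(4, 8), n_orient=4, size=15):
    """Mean absolute response and standard deviation per filter (assumed features)."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    feats = []
    for lam in wavelengths:
        for k in range(n_orient):
            kern = gabor_kernel(size, lam, k * np.pi / n_orient, sigma=0.5 * lam)
            # Multiplying spectra gives circular convolution, fine for texture statistics.
            resp = np.real(np.fft.ifft2(spectrum * np.fft.fft2(kern, s=image.shape)))
            feats.extend([np.abs(resp).mean(), resp.std()])
    return np.array(feats)

def knn_predict(train_X, train_y, test_X, k=1):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dist = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])

def cross_val_accuracy(X, y, n_folds=5, k=1, seed=0):
    """Mean accuracy over n_folds shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accs = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        # Standardize with training-fold statistics only.
        mu, sd = X[train].mean(axis=0), X[train].std(axis=0) + 1e-9
        pred = knn_predict((X[train] - mu) / sd, y[train], (X[test] - mu) / sd, k)
        accs.append((pred == y[test]).mean())
    return float(np.mean(accs))

def toy_dataset(n_per_class=20, shape=(32, 64), seed=0):
    """Hypothetical data: noisy stripes in two orientations, one per 'script'."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    X, y = [], []
    for label, coord in enumerate((xx, yy)):
        for _ in range(n_per_class):
            img = np.sin(2.0 * np.pi * coord / 8.0) + 0.3 * rng.standard_normal(shape)
            X.append(gabor_features(img))
            y.append(label)
    return np.array(X), np.array(y)

X, y = toy_dataset()
print("five-fold accuracy:", cross_val_accuracy(X, y))
```

With real data, each row of X would instead come from a binarized, pre-processed line segment, and y would hold one integer label per script.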
