Separating Indic Scripts with matra for Effective Handwritten Script Identification in Multi-Script Documents

We present a novel approach for separating Indic scripts that use a ‘matra’ from those that do not, as a precursor that eases and improves subsequent handwritten script identification in multi-script documents. In our study, among state-of-the-art features and classifiers, an optimized fractal geometry analysis combined with a random forest classifier performs best at distinguishing scripts with ‘matra’ from their counterparts. For validation, a total of 1204 document images are used: two scripts with ‘matra’, Bangla and Devanagari, serve as positive samples, and two scripts without it, Roman and Urdu, serve as negative samples. With this precursor, overall script identification improves by more than 5.13% in accuracy and runs 1.17 times faster than a conventional system.
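To make the fractal-feature idea concrete, the following is a minimal sketch of a box-counting fractal dimension estimate on a binarized image. The abstract does not say which fractal measure or optimization the authors use, so box counting is an assumption made for illustration only, and the names `box_count` and `fractal_dimension` are hypothetical. A value computed this way could serve as one input feature to a classifier such as a random forest.

```python
import math


def box_count(image, size):
    """Count the size x size boxes that contain at least one foreground pixel.

    `image` is a list of rows, each a list of 0/1 values (1 = ink).
    """
    occupied = set()
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if pixel:
                occupied.add((r // size, c // size))
    return len(occupied)


def fractal_dimension(image, sizes=(1, 2, 4, 8)):
    """Estimate the box-counting dimension as the least-squares slope of
    log N(s) against log(1/s), where N(s) is the occupied-box count.

    Assumption: this is the textbook box-counting estimate, not the
    optimized analysis described in the paper.
    """
    xs, ys = [], []
    for s in sizes:
        n = box_count(image, s)
        if n > 0:
            xs.append(math.log(1.0 / s))
            ys.append(math.log(n))
    if len(xs) < 2:
        raise ValueError("need at least two non-empty scales")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy inputs: a filled square and a single horizontal stroke.
filled = [[1] * 16 for _ in range(16)]
stroke = [[1 if r == 0 else 0 for _ in range(16)] for r in range(16)]
print(round(fractal_dimension(filled), 3), round(fractal_dimension(stroke), 3))
```

On these toy inputs a filled region has a dimension near 2 and a straight stroke a dimension near 1; real handwriting would fall somewhere in between.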
