Conversion and Recognition of Handwritten Devnagari Character String into Printed Character String Using KNN

This paper presents a system that converts a handwritten string of Devnagari characters into a printed character string using a character segmentation approach. Eleven statistical features are extracted from each segmented character and compared, using the K-nearest neighbor (kNN) algorithm, with features extracted from printed character strings in the training data for cross-validation purposes. Because it accepts handwritten Devnagari strings written in different styles and converts them into printed strings, the system is better suited to real-life applications. The system works mainly by segmenting characters with bounding boxes; after segmentation, features are extracted and compared with the training feature set. We have compared our system with existing Devnagari handwritten character recognition systems. In the given framework, we have focused on creating a database of strings written in different styles and recognizing them as printed characters.

Keywords: K-Nearest Neighborhood Algorithm, Connected Components Labeling, Bounding Box, Statistical Feature, Feature Extraction Technique, Handwritten Devnagari string segmentation, Object extraction, Printed String of Characters.
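
The pipeline described in the abstract (connected components labeling, bounding-box segmentation, feature extraction, kNN matching) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binarized image given as a list of rows of 0/1 values and characters that do not touch; in real Devnagari words the header line joins characters and would have to be removed before labeling. The abstract does not list the eleven statistical features, so the four features below (ink density, aspect ratio, normalized centroid) are hypothetical placeholders, and the function names `bounding_boxes`, `features`, and `knn_predict` are illustrative only.

```python
import math
from collections import Counter, deque


def bounding_boxes(image):
    """Label 8-connected foreground components with a breadth-first search.

    Returns one (top, left, bottom, right) box per component, with inclusive
    coordinates, sorted left to right so the boxes follow reading order.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


def features(image, box):
    """Placeholder statistical features for one segmented character.

    Hypothetical choice (the paper's eleven features are not listed):
    ink density, width/height ratio, and the ink centroid scaled by box size.
    """
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    ink = [(y - top, x - left)
           for y in range(top, bottom + 1)
           for x in range(left, right + 1)
           if image[y][x] == 1]
    count = len(ink)
    density = count / (height * width)
    aspect = width / height
    centroid_y = sum(y for y, _ in ink) / count / height
    centroid_x = sum(x for _, x in ink) / count / width
    return (density, aspect, centroid_y, centroid_x)


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In use, each box returned by `bounding_boxes` would be passed to `features`, and the resulting vector to `knn_predict` together with feature vectors computed from printed reference characters; the predicted labels, read left to right, give the printed string.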
