Classification of printed Gujarati characters using SOM-based k-Nearest Neighbor classifier

This paper presents a method that combines a Self-Organizing Map (SOM) with a k-Nearest Neighbor (k-NN) classifier to devise an elegant classification technique, and applies it to the classification of a subset of printed Gujarati characters. Researchers have employed many different models to classify printed and handwritten characters in a number of languages around the world; widely used classifiers include Template Matching, Artificial Neural Networks (ANN), Hidden Markov Models (HMM), and Support Vector Machines (SVM). Our attempt is to use a SOM-based k-NN classifier for this task. Because the approach does not require a prior feature identification stage, it is faster and more general than other approaches. A prototype system was implemented and tested on a sufficient dataset. An average accuracy of 82.36% is reported on the test dataset.
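The abstract does not spell out how the SOM and the k-NN classifier are coupled, so the following is only a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes that each character image is flattened into a raw pixel vector (no feature extraction), that a small SOM is trained on those vectors, that each map unit is labeled by majority vote of the training samples it wins, and that k-NN is then run over the labeled map units. The function names (`train_som`, `label_units`, `knn_predict`), the map size, the learning schedule, and the synthetic two-class data standing in for character images are all hypothetical.

```python
import numpy as np
from collections import Counter


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n, _ = data.shape
    # Initialise every map unit from a randomly chosen training vector.
    weights = data[rng.integers(0, n, rows * cols)].astype(float)
    # Grid coordinates of each unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * n, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # linearly shrinking neighbourhood
            x = data[i]
            # Best-matching unit: the unit whose weight vector is closest to x.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def label_units(weights, data, labels):
    """Give each unit the majority label of the samples it wins (-1 if it wins none)."""
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmus = np.argmin(dists, axis=1)
    unit_labels = np.full(len(weights), -1)
    for u in range(len(weights)):
        won = labels[bmus == u]
        if len(won):
            unit_labels[u] = Counter(won.tolist()).most_common(1)[0][0]
    return unit_labels


def knn_predict(x, weights, unit_labels, k=3):
    """Classify x by majority vote among its k nearest labeled SOM units."""
    keep = unit_labels >= 0          # ignore units that never won a sample
    protos, plabels = weights[keep], unit_labels[keep]
    nearest = np.argsort(np.sum((protos - x) ** 2, axis=1))[:k]
    return Counter(plabels[nearest].tolist()).most_common(1)[0][0]


# Synthetic stand-in for flattened 4x4 character images: two well-separated classes.
rng = np.random.default_rng(42)
class0 = rng.normal(0.0, 0.1, size=(40, 16))
class1 = rng.normal(1.0, 0.1, size=(40, 16))
train_x = np.vstack([class0, class1])
train_y = np.array([0] * 40 + [1] * 40)

weights = train_som(train_x, rows=3, cols=3)
unit_labels = label_units(weights, train_x, train_y)
prediction = knn_predict(np.full(16, 1.0), weights, unit_labels, k=1)
print("predicted class:", prediction)
```

In this arrangement the SOM acts as a prototype reducer: k-NN compares a query against a handful of map units rather than every training image, which is one way the combination could be faster than plain k-NN on raw pixels.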
