A neural network-based approach for recognizing multi-font printed English characters

Abstract In this paper, we propose a method for recognizing printed English characters in different fonts. The proposed method, which is based on a neural network, is robust to font variation. When samples in new fonts are added to the database, the accuracy of existing methods decreases rapidly, whereas the accuracy of the proposed method stays almost constant. A similarity-measure neural network is used to identify characters: the similarity measure compares the features of an input character with the features of the indicators associated with the characters A to Z, which are obtained in the training stage. We use a similarity measure instead of the distance measure in the SOM neural network because a person learns characters independently of font, and a literate person can read a note without knowing the font in which it is written. In effect, he or she measures the similarity between the note in a new font and the learned notes held in mind. Therefore, we train the network with two samples as representatives of all fonts, analogous to the default notes in a person's mind. We obtained 98.56% recognition accuracy on a database that includes 24 different fonts in 11 different sizes.
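
The matching step described above, comparing the features of an input character with one stored indicator per letter and choosing the most similar one, can be illustrated with a minimal sketch. The abstract does not state which similarity measure or features are used, so cosine similarity, the four-element feature vectors, and the helper names `cosine_similarity`, `build_indicators`, and `classify` are all assumptions made for illustration only. Averaging the training samples to form each indicator is likewise an assumption, not necessarily the training procedure of the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors.

    Assumption: the paper's similarity measure is not given in the
    abstract; cosine similarity is used here purely as a stand-in.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def build_indicators(training_samples):
    """Build one indicator vector per label from a few training samples.

    Assumption: each indicator is the element-wise mean of that label's
    training vectors (hypothetical; the abstract only says that two
    samples are used for training).
    """
    indicators = {}
    for label, vectors in training_samples.items():
        count = len(vectors)
        indicators[label] = [sum(column) / count for column in zip(*vectors)]
    return indicators


def classify(features, indicators):
    """Return the label whose indicator is most similar to `features`."""
    return max(
        indicators,
        key=lambda label: cosine_similarity(features, indicators[label]),
    )


if __name__ == "__main__":
    # Toy feature vectors standing in for real character features.
    training = {
        "A": [[1.0, 0.0, 1.0, 0.0], [3.0, 0.0, 3.0, 0.0]],
        "B": [[0.0, 1.0, 0.0, 1.0], [0.0, 2.0, 0.0, 2.0]],
    }
    indicators = build_indicators(training)
    # A sample from an "unseen font": same shape of features, different scale.
    print(classify([4.0, 0.2, 3.8, 0.0], indicators))
```

One property of this stand-in measure is that it ignores the overall magnitude of the feature vector, so a sample whose features are uniformly scaled still matches the same indicator. Whether the paper's measure behaves the same way cannot be determined from the abstract.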
