Machine printed handwritten text discrimination using Radon transform and SVM classifier

Discriminating machine-printed from handwritten text is a major problem in the recognition of mixed texts. In this paper, we address the problem of identifying each type using the Radon transform and Support Vector Machines (SVMs). The method proceeds in three steps: preprocessing, feature generation, and classification. A new set of features is generated from each word using the Radon transform, and an SVM classifier then distinguishes printed words from handwritten ones. The proposed system is tested on the IAM database, where it achieves a recognition rate of over 98%.
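To make the feature-generation step concrete, the following is a minimal sketch of how Radon-style projections of a binarized word image could be turned into a fixed-length feature vector. It is an illustration only, not the feature set used in the paper: the function names `radon_projection` and `radon_features`, the choice of four angles, the eight bins per angle, and the per-angle normalization are all assumptions made for this example. The projection accumulates each foreground pixel at its signed distance rho = x cos(theta) + y sin(theta) from the image centre, which is a coarse discrete form of the Radon transform.

```python
import math


def radon_projection(image, theta_deg, n_bins):
    """Coarse discrete Radon projection of a binary image at one angle.

    `image` is a list of rows holding 0/1 values (assumed already binarized).
    Each foreground pixel is accumulated in the bin of its signed distance
    rho = x*cos(theta) + y*sin(theta), measured from the image centre.
    """
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    half_diag = math.hypot(width, height) / 2.0  # bound on |rho|
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    bins = [0.0] * n_bins
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value:
                rho = (x - cx) * cos_t + (y - cy) * sin_t
                idx = int((rho + half_diag) / (2.0 * half_diag) * n_bins)
                bins[min(idx, n_bins - 1)] += value
    return bins


def radon_features(image, angles=(0, 45, 90, 135), n_bins=8):
    """Concatenate normalized projections over several angles.

    The angles and bin count are illustrative assumptions, not values
    taken from the paper.
    """
    features = []
    for angle in angles:
        projection = radon_projection(image, angle, n_bins)
        total = sum(projection) or 1.0  # avoid dividing by zero on blank images
        features.extend(p / total for p in projection)
    return features


# A 5x5 image containing a single vertical stroke in the middle column.
stroke = [[0, 0, 1, 0, 0] for _ in range(5)]
vector = radon_features(stroke)
```

For the vertical stroke, the projection at 0 degrees concentrates all pixels in a single bin, while the projection at 90 degrees spreads them over several bins. Vectors of this kind, computed per word, would then be passed to an SVM classifier trained on labelled printed and handwritten words.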
