Histogram Clustering and Hybrid Classifier for Handwritten Arabic Characters Recognition

Segmenting handwritten Arabic characters remains one of the most difficult problems in developing a reliable Arabic OCR system. This paper presents a complete Arabic OCR system that uses a histogram clustering method to segment Arabic words. The method can handle different writing styles and copes with the variability of pen strokes. A new algorithm for separating overlapped characters is also proposed to support the segmentation technique. Feature extraction is based on a combination of a PCA network and the geometric features of the characters. A classifier for hundreds of Arabic character images was designed using a decision tree induction algorithm and an MLP network. Segmentation correctness reached 96%, and the recognition rate of the whole system was 91.5%.
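
To make the idea of histogram-driven segmentation concrete, the sketch below computes a column (vertical projection) histogram of a binary word image and proposes cut points in the middle of thin connecting strokes. It is only an illustration under stated assumptions, not the clustering method of the paper, whose details the abstract does not give. The function names, the `max_ink` threshold, and the toy image are all hypothetical.

```python
import numpy as np

def column_histogram(word):
    """Count ink pixels in every column of a binary word image (1 = ink)."""
    return np.asarray(word).sum(axis=0)

def candidate_cuts(hist, max_ink=1):
    """Return the midpoint of each run of thin columns.

    A column is 'thin' when it holds between 1 and max_ink ink pixels,
    which is assumed here to indicate a connecting stroke between characters.
    """
    thin = (hist >= 1) & (hist <= max_ink)
    cuts, start = [], None
    for x, flag in enumerate(thin):
        if flag and start is None:
            start = x                      # a thin run begins
        elif not flag and start is not None:
            cuts.append((start + x - 1) // 2)  # midpoint of the finished run
            start = None
    if start is not None:                  # run that reaches the image edge
        cuts.append((start + len(thin) - 1) // 2)
    return cuts

# Toy image: two 2-pixel-wide blobs joined by a 1-pixel baseline stroke.
word = np.array([
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
])
hist = column_histogram(word)
cuts = candidate_cuts(hist)
print(hist.tolist(), cuts)
```

In real handwriting the threshold would have to adapt to stroke thickness and writer style. A clustering step over the histogram values, as the abstract suggests, would presumably serve that purpose.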
