A Recognition-Based Approach to Segmenting Arabic Handwritten Text

Segmenting Arabic handwriting has been a subject of research in the field of Arabic character recognition for more than 25 years. Most reported segmentation techniques share a critical shortcoming: over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word. Over-segmentation occurs when a resulting letter (segment) is split into more than one piece (stroke) instead of being reported as a single unit. Our objective is to overcome this problem by using an Artificial Neural Network (ANN) to verify each resulting segment. We propose a set of heuristic-based rules that assemble strokes so that the precise segmented letters are reported. Preprocessing phases, including normalization and feature extraction, are a prerequisite for the ANN system used for recognition and verification. In our previous work [1], we achieved a segmentation success rate of 86%, but without recognition. In this work, our experimental results confirm a segmentation success rate of at least 95%.
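
The abstract does not spell out the heuristic rules or the ANN architecture, so the following is only a minimal sketch of the general idea of recognition-based stroke assembly, not the authors' method. It assumes a hypothetical verifier function that stands in for the ANN and returns a (label, confidence) pair for a candidate group of strokes; the names assemble_strokes, toy_verify, max_group, and min_conf are illustrative assumptions, and the greedy grouping is a simplification chosen for brevity.

```python
from typing import Callable, List, Sequence, Tuple

Stroke = str  # stand-in for a stroke image or feature vector (assumption)
Verifier = Callable[[Sequence[Stroke]], Tuple[str, float]]


def assemble_strokes(
    strokes: Sequence[Stroke],
    verify: Verifier,
    max_group: int = 3,     # hypothetical cap on strokes per letter
    min_conf: float = 0.5,  # hypothetical acceptance threshold
) -> List[Tuple[str, List[Stroke]]]:
    """Greedily merge consecutive strokes into letters.

    At each position, every group of 1..max_group consecutive strokes is
    scored by the verifier, and the group with the highest confidence is
    kept. Groups scoring below min_conf are labelled "?".
    """
    letters: List[Tuple[str, List[Stroke]]] = []
    i = 0
    while i < len(strokes):
        best_size = 1
        best_label, best_conf = verify(strokes[i:i + 1])
        for size in range(2, max_group + 1):
            if i + size > len(strokes):
                break
            label, conf = verify(strokes[i:i + size])
            if conf > best_conf:
                best_size, best_label, best_conf = size, label, conf
        if best_conf < min_conf:
            best_label = "?"
        letters.append((best_label, list(strokes[i:i + best_size])))
        i += best_size
    return letters


# Toy verifier: a lookup table in place of a trained ANN (made-up scores).
_TOY_SCORES = {
    ("a1",): ("?", 0.2),
    ("a1", "a2"): ("A", 0.9),
    ("b",): ("B", 0.8),
    ("b", "c1"): ("?", 0.3),
    ("c1",): ("?", 0.3),
    ("c1", "c2"): ("C", 0.95),
}


def toy_verify(group: Sequence[Stroke]) -> Tuple[str, float]:
    return _TOY_SCORES.get(tuple(group), ("?", 0.0))


if __name__ == "__main__":
    # Five over-segmented strokes that should be reported as three letters.
    for label, group in assemble_strokes(["a1", "a2", "b", "c1", "c2"], toy_verify):
        print(label, group)
```

In this toy run, the strokes a1 and a2 are merged into one letter because the verifier is more confident in the pair than in a1 alone, which is the over-segmentation repair described above. A real system would replace toy_verify with a trained network applied to normalized features.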

[1] Mohammad S. Khorsheed, et al. Off-Line Arabic Character Recognition – A Review, 2002, Pattern Analysis & Applications.

[2] Paul Wintz, et al. Instructor's manual for digital image processing, 1987.

[3] W. F. Clocksin, et al. Structural Features of Cursive Arabic Script, 1999, BMVC.

[4] Robert M. Haralick, et al. Segmentation-free word recognition with application to Arabic, 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.

[5] Andrew M. Gillies, et al. Arabic Text Recognition System, 2007.

[6] Daoud Berkani, et al. Recognition System for Printed Multi-Font and Multi-Size Arabic Characters, 2002.

[7] Robert M. Haralick, et al. A segmentation-free approach to text recognition with application to Arabic text, 1996, International Journal on Document Analysis and Recognition.

[8] Ashraf Elnagar, et al. A Multi-Agent Approach to Arabic Handwritten Text Segmentation, 2012.

[9] Behrooz Parhami, et al. Automatic recognition of printed Farsi texts, 1981, Pattern Recognit.

[10] Adnan Amin, et al. Recognition of printed Arabic text based on global features and decision tree learning techniques, 2000, Pattern Recognit.

[11] Luiz Eduardo Soares de Oliveira, et al. Filtering segmentation cuts for digit string recognition, 2008, Pattern Recognit.

[12] Sabri A. Mahmoud, et al. Survey and bibliography of Arabic optical text recognition, 1995, Signal Process.

[13] Ahmed Sharaf Eldin, et al. Arabic character recognition: a survey, 1998, Defense + Commercial Sensing.

[14] Adnan Amin, et al. Off-line Arabic character recognition: the state of the art, 1998, Pattern Recognit.

[15] Volker Märgner, et al. HMM based approach for handwritten Arabic word recognition using the IFN/ENIT-database, 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.

[16] Ashraf Elnagar, et al. Segmentation of connected handwritten numeral strings, 2003, Pattern Recognit.

[17] Ching Y. Suen, et al. Character Recognition Systems: A Guide for Students and Practitioners, 2007.

[18] Karim Faez, et al. Handwritten Farsi (Arabic) word recognition: a holistic approach using discrete HMM, 2001, Pattern Recognit.