A hybrid method for three segmentation levels of handwritten Arabic script

This paper addresses the segmentation of handwritten Arabic script at three levels: blocks, connected components, and characters. The approach combines the Hough Transform with Mathematical Morphology tools. We first present a methodology for segmenting a complex document into its distinct entities, namely its handwritten components. Each extracted handwritten block is then segmented into sub-words, a defining characteristic of Arabic script. Finally, a character segmentation method is presented. Each segmentation step relies on specific concepts, such as a dynamic kernel and Harris corner detectors. The proposed method is tested on the CENPARMI Arabic check database. We also present a concept for automatic evaluation of the results, based on labeling tools for the different parts of the documents used.
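To make the middle segmentation level concrete, the following is a minimal sketch of connected-component extraction on a binarized text block. It is not the method of the paper, which combines the Hough Transform with morphological operators and a dynamic kernel; it only illustrates what a connected component is in this setting. The function name `label_components`, the 8-connectivity choice, and the toy image are assumptions made for illustration.

```python
from collections import deque


def label_components(image):
    """Return bounding boxes of 8-connected foreground regions.

    `image` is a list of rows holding 0 (background) or 1 (ink).
    Each box is (top, left, bottom, right), with inclusive bounds.
    Hypothetical helper for illustration, not taken from the paper.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    # Arabic is read right to left, so order boxes by right edge, descending.
    boxes.sort(key=lambda box: -box[3])
    return boxes


# Toy binarized block containing two separate strokes.
toy = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(label_components(toy))
```

In real handwriting, diacritical dots form components of their own, so a practical pipeline would likely need an extra step to attach them to the sub-word they belong to; that step is omitted here.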
