A Novel Baseline Detection Method of Handwritten Arabic-Script Documents Based on Sub-Words

Baseline detection is an important step in document image analysis and recognition systems. It is used extensively in many preprocessing stages, such as text normalization, skew correction, character segmentation, and slant and slope correction, as well as in feature extraction. In this work, we propose a new baseline detection method for Arabic script based on the horizontal projection histogram and the direction features of the sub-word skeleton. Sub-words form the main components of the text; each consists of at least one letter, together with any diacritics and dots. The efficiency of the proposed method is demonstrated by experimental results on the IFN/ENIT Arabic benchmark dataset.
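To make the horizontal projection idea concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a binarized sub-word image in which ink pixels are 1, and it takes the row with the most ink as a rough baseline estimate. The function names `horizontal_projection` and `estimate_baseline_row` and the toy image are hypothetical, and the skeleton direction features used by the proposed method are not modeled here.

```python
import numpy as np


def horizontal_projection(binary):
    """Count foreground (ink) pixels in each row of a binary image."""
    return np.asarray(binary, dtype=bool).sum(axis=1)


def estimate_baseline_row(binary):
    """Return the index of the row with the highest ink count.

    This is a plain peak-of-histogram heuristic (an assumption for
    illustration); ties resolve to the first such row.
    """
    profile = horizontal_projection(binary)
    return int(np.argmax(profile))


# Toy 6x8 "sub-word": a dot on top, two vertical strokes,
# a dense horizontal stroke, and a short descender.
image = np.array([
    [0, 0, 0, 1, 0, 0, 0, 0],  # dot
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0],  # vertical strokes
    [0, 1, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],  # dense horizontal stroke
    [0, 0, 1, 0, 0, 0, 0, 0],  # descender
])

profile = horizontal_projection(image)
baseline = estimate_baseline_row(image)
```

For this toy image the profile is `[1, 0, 2, 2, 7, 1]`, so the estimate is row 4. On real handwriting a single histogram peak can be misled by short or skewed sub-words, which is one reason a method might combine the histogram with further cues such as skeleton directions.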
