Comparison of Different Preprocessing and Feature Extraction Methods for Offline Recognition of Handwritten Arabic Words

Preprocessing and feature extraction are critical steps in the automatic recognition of cursive handwritten words. Based on an offline recognition system for Arabic handwritten words that uses a semi-continuous one-dimensional Hidden Markov Model (HMM) recognizer, different preprocessing methods combined with different feature sets are presented. The dependence of the feature sets on the preprocessing steps is discussed, and their performance is compared on the IFN/ENIT database of handwritten Arabic words. Because the lower and upper baselines of each word are part of the database's ground truth, the dependence of the feature sets on the accuracy of the estimated baseline is also evaluated.
