White-space models for offline Arabic handwriting recognition

We propose to model white-spaces explicitly for Arabic handwriting recognition within different writing variants. Because character shapes in Arabic handwriting depend on their position in the word, large white-spaces can occur between characters even within a word. We therefore propose a separate character model for white-spaces, combined with a lexicon that uses different writing variants and with character model length adaptation. Current handwriting recognition systems either model white-spaces implicitly within the character models, which can degrade those models, or try to segment Arabic words explicitly into pieces of Arabic words, which is prone to segmentation errors. Several white-space modeling approaches are analyzed on the well-known IFN/ENIT database and outperform the best reported error rates.
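
The idea of a lexicon with writing variants that include a separate white-space model can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `writing_variants`, the model name `ws`, and the representation of a word as a list of connected pieces are all assumptions made for illustration. Each gap between two pieces of a word is treated as optionally containing the white-space model, so a word with n gaps yields 2^n lexicon entries.

```python
from itertools import product

# Hypothetical name for the separate white-space character model.
WS = "ws"


def writing_variants(pieces):
    """Enumerate writing variants of one word.

    pieces: list of lists of character-model names, one inner list per
    connected piece of the word (an assumed representation).
    Returns a list of tuples; in each variant, every gap between two
    pieces either contains the white-space model or does not.
    """
    if not pieces:
        return []
    gaps = len(pieces) - 1
    variants = []
    # One boolean per gap: insert the white-space model there or not.
    for choice in product((False, True), repeat=gaps):
        seq = list(pieces[0])
        for use_ws, piece in zip(choice, pieces[1:]):
            if use_ws:
                seq.append(WS)
            seq.extend(piece)
        variants.append(tuple(seq))
    return variants


if __name__ == "__main__":
    # Placeholder character-model names, not real Arabic glyph labels.
    for variant in writing_variants([["a", "b"], ["c"]]):
        print(" ".join(variant))
```

A recognizer that uses such a lexicon can choose, per word hypothesis, the variant that best matches the observed gaps, so the character models themselves need not absorb the white-space.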
