Writer adaptation in off-line Arabic handwriting recognition

Writer adaptation, or specialization, is the adjustment of handwriting recognition algorithms to a specific writer's style of handwriting. Such adjustment yields significantly higher recognition rates than the corresponding writer-independent algorithms. We present the first unconstrained off-line handwriting adaptation algorithm for Arabic in the literature. We discuss an iterative bootstrapping model that adapts a writer-independent model to a writer-dependent model using a small number of words, achieving a large increase in recognition rate in the process. Furthermore, we describe a confidence weighting method that generates better results by weighting words based on their length. We also discuss script features unique to Arabic and how we incorporate them into our adaptation process. Even though Arabic has many more character classes than languages such as English, significant improvement was observed. The testing set, consisting of about 100 pages of handwritten text, had an initial average overall recognition rate of 67%. After the basic adaptation, the overall recognition rate was 73.3%. Because the improvement was most marked for the longer words, and the set of confidently recognized longer words contained far fewer false results, we present a second method that uses the longer words alone, resulting in a recognition rate of about 75%. These longer words initially had a recognition rate of 69.5%, which improved to about 92% after adaptation. Finally, we present a novel hybrid method with a recognition rate of about 77.2%.
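The iterative bootstrapping and length-based confidence weighting described above can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, thresholds, or recognizer interface, so everything named here (`Hypothesis`, `length_weighted_confidence`, `adapt_to_writer`, `ref_len`, `threshold`, and the `recognize` and `retrain` callbacks) is a hypothetical stand-in rather than the authors' method.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """Top recognition result for one word image (hypothetical structure)."""
    word: str          # best-matching lexicon entry
    confidence: float  # recognizer score, assumed to lie in [0, 1]


def length_weighted_confidence(hyp: Hypothesis, ref_len: int = 5) -> float:
    """Down-weight short words, which are assumed to be less reliable.

    Assumption: a linear ramp up to `ref_len` characters. The source only
    says that words are weighted by length, not how.
    """
    return hyp.confidence * min(1.0, len(hyp.word) / ref_len)


def adapt_to_writer(
    model: Any,
    images: Sequence[Any],
    recognize: Callable[[Any, Any], Hypothesis],
    retrain: Callable[[Any, List[Tuple[Any, str]]], Any],
    rounds: int = 3,
    threshold: float = 0.6,
    ref_len: int = 5,
) -> Any:
    """Bootstrap a writer-independent model toward one writer.

    Each round recognizes every word image, keeps only the results whose
    length-weighted confidence reaches `threshold`, and retrains on those
    self-labelled words.
    """
    for _ in range(rounds):
        trusted: List[Tuple[Any, str]] = []
        for image in images:
            hyp = recognize(model, image)
            if length_weighted_confidence(hyp, ref_len) >= threshold:
                trusted.append((image, hyp.word))
        if not trusted:
            break  # nothing confident enough to learn from
        model = retrain(model, trusted)
    return model
```

In this sketch the caller supplies `recognize` and `retrain`, so the loop is independent of any particular recognizer. Raising `ref_len` or `threshold` restricts adaptation to longer, more confidently recognized words, in the spirit of the second method described above.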
