Recognition of Off-Line Arabic Handwriting Words Using HMM Toolkit (HTK)

Building an accurate recognition system for handwritten Arabic is difficult, notably because many characters have similar shapes and because human handwriting varies without limit. This paper presents a handwritten Arabic word recognition system. The objective is to propose an analytical offline recognition method for handwritten Arabic that can be implemented rapidly. The first stage of the system is a preprocessing phase that prepares the data. A window then slides along each text line from right to left and extracts a set of simple statistical features, and the resulting feature vectors are passed to the Hidden Markov Model Toolkit (HTK). In the recognition phase, the concatenation of characters into words is modelled by simple lexical models: each word is modelled by a stochastic finite-state automaton (SFSA). The proposed system is applied to an "Arabic-Numbers" data corpus, which contains 47 words and 1905 sentences written by five different writers.
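
The right-to-left sliding-window step can be illustrated with a minimal sketch. The abstract does not say which statistical features are used, so the two computed here (ink density and normalized vertical centre of gravity), the function name `sliding_window_features`, and the `width` and `step` parameters are all assumptions chosen for illustration, not the authors' actual feature set.

```python
from typing import List, Sequence


def sliding_window_features(
    image: Sequence[Sequence[int]], width: int = 3, step: int = 1
) -> List[List[float]]:
    """Extract one feature vector per window, scanning right to left.

    `image` is a binarized text line given as rows of 0/1 pixels (1 = ink).
    The two features per window are illustrative assumptions:
      - ink density: fraction of ink pixels inside the window
      - vertical centre of gravity of the ink, scaled to [0, 1]
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")

    height = len(image)
    n_cols = len(image[0]) if height else 0
    features: List[List[float]] = []

    right = n_cols  # exclusive right edge; Arabic is read right to left
    while right > 0:
        left = max(0, right - width)  # the last window may be narrower
        ink = 0
        weighted_rows = 0
        for r, row in enumerate(image):
            count = sum(row[left:right])
            ink += count
            weighted_rows += r * count

        area = height * (right - left)
        density = ink / area
        # An empty window has no centre of gravity; use the middle by convention.
        centre = (weighted_rows / ink) / max(1, height - 1) if ink else 0.5
        features.append([density, centre])
        right -= step

    return features


if __name__ == "__main__":
    line = [
        [0, 0, 1, 1],
        [0, 0, 0, 1],
    ]
    for vector in sliding_window_features(line, width=2, step=2):
        print(vector)
```

The first vector returned describes the rightmost window, matching the reading direction of Arabic. In a pipeline like the one described, such vectors would still have to be written out in HTK's parameter file format before training; that step is not shown here.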
