Large Vocabulary Arabic Online Handwriting Recognition System

Arabic script is consonantal and cursive. Its analysis is further complicated by obligatory dots and strokes that are placed above or below most letters and are usually written out of order, as delayed strokes. Because of ambiguities and the diversity of writing styles, recognition systems are generally based on a set of possible words called a lexicon. When the lexicon is small, accuracy is the main concern, since recognition time is minimal. When the lexicon is large, both recognition speed and accuracy are critical. Arabic is rich in morphology and syntax, which makes its lexicon large. Therefore, a practical online handwriting recognition system should handle a large lexicon with reasonable performance in terms of both accuracy and time. In this paper, we introduce a fully fledged Hidden Markov Model (HMM) based system for Arabic online handwriting recognition that provides solutions for most of the difficulties inherent in recognizing Arabic script. A new preprocessing technique for handling delayed strokes is introduced. We use advanced modeling techniques to build the recognition system from the training data in order to represent the differences between writing units in more detail, minimize the variance between writers in the training data, and better represent the feature space. System results are enhanced by an additional post-processing step that uses a higher-order language model and cross-word HMMs. System performance is evaluated on two databases covering small and large lexicons. Our system outperforms state-of-the-art systems on the small-lexicon database. Furthermore, it shows promising results (accuracy and time) with a large lexicon, and the models can be adapted to specific writers for even better results.
