Semi-continuous HMMs with explicit state duration for unconstrained Arabic word modeling and recognition

In this paper, we describe an off-line unconstrained handwritten Arabic word recognition system based on a segmentation-free approach and semi-continuous hidden Markov models (SCHMMs) with explicit state duration. Character durations play a significant part in the recognition of cursive handwriting, yet duration information is still largely disregarded in HMM-based cursive handwriting recognizers because standard HMMs model character durations poorly. We show experimentally that explicit state duration modeling in the SCHMM framework significantly improves the discriminating capacity of SCHMMs on very difficult pattern recognition tasks such as unconstrained handwritten Arabic recognition. To make letter and word model training and recognition more efficient, we propose a new version of the Viterbi algorithm that takes explicit state duration modeling into account. Three distributions (Gamma, Gaussian and Poisson) are used for the explicit state duration modeling and compared. To perform word recognition, the system uses an original sliding window approach based on vertical projection histogram analysis of the word image, from which it extracts a new set of pertinent statistical and structural features. Several experiments have been performed on the IFN/ENIT benchmark database, and the best recognition performance achieved by our system exceeds the results recently reported on the same database.
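The abstract does not spell out the modified Viterbi recursion, so the following is only a minimal sketch of the generic explicit-duration (hidden semi-Markov) Viterbi recursion in the log domain, not the authors' algorithm. It assumes a Poisson law on the duration minus one, finite emission scores, and no self-transitions; the function names `poisson_log_pmf` and `duration_viterbi` and all parameter names are hypothetical. A Gamma or Gaussian duration score could be substituted for `poisson_log_pmf`.

```python
import math

NEG_INF = float("-inf")


def poisson_log_pmf(d, mean):
    """Log-probability of a duration of d >= 1 frames, assuming d - 1 ~ Poisson(mean)."""
    k = d - 1
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def duration_viterbi(log_pi, log_a, log_b, dur_means, max_dur):
    """Best state segmentation under an explicit-duration HMM (illustrative sketch).

    log_pi[j]    : log initial probability of state j
    log_a[i][j]  : log transition probability i -> j (self-loops are not used;
                   the duration law replaces them)
    log_b[t][j]  : finite log emission score of frame t in state j
    dur_means[j] : Poisson mean of (duration - 1) for state j
    max_dur      : longest duration considered for any state
    Returns (best log score, list of (state, duration) segments).
    """
    T, N = len(log_b), len(log_pi)

    # cum[t][j] = sum of log_b[0..t-1][j], so a segment's emission score is a difference.
    cum = [[0.0] * N for _ in range(T + 1)]
    for t in range(T):
        for j in range(N):
            cum[t + 1][j] = cum[t][j] + log_b[t][j]

    # delta[t][j]: best log score of a segmentation of frames 0..t-1
    # whose last segment is in state j and ends exactly at frame t-1.
    delta = [[NEG_INF] * N for _ in range(T + 1)]
    back = [[None] * N for _ in range(T + 1)]

    for t in range(1, T + 1):
        for j in range(N):
            for d in range(1, min(max_dur, t) + 1):
                seg = poisson_log_pmf(d, dur_means[j]) + cum[t][j] - cum[t - d][j]
                if d == t:
                    # The segment starts at frame 0: use the initial probability.
                    cand, prev = log_pi[j] + seg, None
                else:
                    # Best predecessor state whose segment ends just before this one.
                    others = [i for i in range(N) if i != j]
                    if not others:
                        continue
                    prev = max(others, key=lambda i: delta[t - d][i] + log_a[i][j])
                    cand = delta[t - d][prev] + log_a[prev][j] + seg
                if cand > delta[t][j]:
                    delta[t][j] = cand
                    back[t][j] = (d, prev)

    # Backtrack from the best final state (assumes at least one feasible path).
    state = max(range(N), key=lambda s: delta[T][s])
    best = delta[T][state]
    segments, t = [], T
    while state is not None:
        d, prev = back[t][state]
        segments.append((state, d))
        t -= d
        state = prev
    segments.reverse()
    return best, segments
```

Unlike the standard Viterbi recursion, each step here scores a whole segment (state, duration) at once, which is what lets the duration law enter the path score directly instead of the implicit geometric duration of a self-loop.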
