A Novel Baseline-independent Feature Set for Arabic Handwriting Recognition

HMM-based analytical methods have been widely used for Arabic handwriting recognition. A key factor influencing the performance of HMM-based systems is the choice of features extracted from a sliding window. In this paper, we propose a novel baseline-independent feature set extracted from a wider sliding window to directly capture contextual information. This feature set combines center of mass based log-space distribution features with inverse percentile features. The center of mass based log-space distribution features use a normalized histogram to describe the distribution of foreground pixels in different directions and at different distances with respect to the center of mass. Experiments on the IFN/ENIT database demonstrate the effectiveness of the proposed feature set. Further, this feature set can be combined with some popular baseline-independent features to form a larger feature set, which achieves results comparable to those of several state-of-the-art systems while using a simple HMM-based architecture.
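To make the center of mass based log-space distribution idea concrete, the following is a minimal sketch of a normalized direction/distance histogram computed for one sliding-window frame. It is an illustration under stated assumptions, not the exact feature definition of the paper: the function name `log_space_distribution`, the bin counts `n_dist` and `n_dir`, and the `log1p`-based distance binning are all hypothetical choices made here.

```python
import numpy as np

def log_space_distribution(window, n_dist=3, n_dir=8):
    """Normalized histogram of foreground pixels over log-spaced distance
    bins and uniform direction bins, measured from the center of mass.
    Bin counts and the log1p spacing are assumptions of this sketch."""
    ys, xs = np.nonzero(window)              # foreground pixel coordinates
    if ys.size == 0:                         # empty frame: all-zero feature
        return np.zeros(n_dist * n_dir)

    cy, cx = ys.mean(), xs.mean()            # center of mass of the frame
    dy, dx = ys - cy, xs - cx
    r = np.hypot(dx, dy)                     # distance to the center of mass
    theta = np.arctan2(dy, dx) % (2 * np.pi) # direction in [0, 2*pi]

    # Log-spaced distance bins: near pixels get finer resolution than far ones.
    scale = np.log1p(r.max())
    if scale == 0.0:                         # all pixels sit on the center
        dist_bin = np.zeros(r.size, dtype=int)
    else:
        dist_bin = np.minimum((n_dist * np.log1p(r) / scale).astype(int),
                              n_dist - 1)

    # Uniform direction bins; the modulo guards against theta == 2*pi.
    dir_bin = (n_dir * theta / (2 * np.pi)).astype(int) % n_dir

    hist = np.zeros((n_dist, n_dir))
    np.add.at(hist, (dist_bin, dir_bin), 1)  # count pixels per (distance, direction)
    return (hist / hist.sum()).ravel()       # normalize so the bins sum to 1
```

In an HMM front end of this kind, such a function would be applied to each frame as the window slides along the word image, yielding one feature vector per frame. Because the bins are defined relative to the center of mass of the frame rather than to an estimated writing line, the histogram does not require baseline detection.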
