Offline Arabic handwriting recognition using Hidden Markov Models and Post-Recognition Lexicon Matching

Arabic handwriting recognition is a fairly complex task because different letters can look alike when written in similar styles. This paper presents a new approach to offline recognition of handwritten Arabic words. The method does not require words to be segmented into characters at recognition time, but it does require the training data to be segmented into separate letters. A Hidden Markov Model (HMM) is trained for each letter of the alphabet, covering its various writing styles and the variations in letter shape that depend on position within the word. Extracting several different types of features from each image in the dataset further improves the recognition rate of the whole system. The performance of the proposed method is demonstrated through experiments on the IFN/ENIT (Institut für Nachrichtentechnik / Ecole Nationale d'Ingénieurs de Tunis) reference database, which contains approximately 32,492 handwritten Arabic words in various writing styles and formats from hundreds of writers. The overall recognition rate of the system is 87% after fine-tuning and optimization.
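
The abstract does not spell out how the post-recognition lexicon matching named in the title is carried out. One common way to do such a step is to replace the recognizer's raw letter sequence with the closest lexicon entry under edit distance. The following is only a minimal sketch under that assumption: the choice of Levenshtein distance, the function names `edit_distance` and `match_lexicon`, and the transliterated sample lexicon are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (unit cost per edit)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_lexicon(hypothesis: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon word closest to the recognizer output.

    Assumption: ties are broken by lexicon order (behaviour of min()).
    """
    return min(lexicon, key=lambda word: edit_distance(hypothesis, word))


# Hypothetical, transliterated lexicon used purely for illustration.
sample_lexicon = ["tunis", "sfax", "sousse"]
corrected = match_lexicon("sfux", sample_lexicon)
```

In this sketch a single misrecognized letter in the hypothesis is corrected as long as the intended word remains the nearest lexicon entry; a real system might instead weight substitutions by how often the letter models confuse each pair of letters.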
