Offline handwritten Farsi cursive text recognition using hidden Markov models

In this paper we address the problem of recognizing handwritten Farsi words. We extract two types of features from vertical stripes of each word image: the chain code of the word boundary and the distribution of foreground density across the word image. The extracted feature vectors are quantized using a self-organizing vector quantizer. The resulting codes are used to train a model for each word in the database, with each word modeled by a discrete hidden Markov model (HMM). To evaluate the proposed system, we conducted an experiment on the newly prepared FARSA database, testing the method on 198 word classes. The experimental results are very promising in comparison with existing methods.
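
The pipeline described above (stripe features, vector quantization, one discrete HMM per word) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: only the foreground-density feature is computed (the chain-code feature is omitted), the codebook is taken as given rather than trained with a self-organizing map, the HMM parameters are assumed to be already estimated, and all function names and parameters (`stripe_width`, `n_zones`, and so on) are hypothetical.

```python
import numpy as np

def stripe_density_features(img, stripe_width=4, n_zones=4):
    """Foreground density per vertical stripe (illustrative feature only).

    img is a 2-D binary array (1 = ink) with at least n_zones rows.
    Each stripe is cut into n_zones horizontal zones and the fraction
    of ink pixels in each zone becomes one feature component.
    """
    _, width = img.shape
    feats = []
    for x0 in range(0, width, stripe_width):
        stripe = img[:, x0:x0 + stripe_width]
        zones = np.array_split(stripe, n_zones, axis=0)
        feats.append([zone.mean() for zone in zones])
    return np.asarray(feats, dtype=float)

def quantize(feats, codebook):
    """Map each feature vector to the index of its nearest codeword.

    The codebook is assumed given; training it (for example with a
    self-organizing map) is outside this sketch.
    """
    dists = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm for a discrete HMM.

    pi: initial state probabilities, shape (N,)
    A:  state transition matrix, shape (N, N)
    B:  emission matrix, shape (N, M), B[i, k] = P(symbol k | state i)
    Returns log P(obs | model), or -inf if the sequence is impossible.
    """
    alpha = pi * B[:, obs[0]]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        scale = alpha.sum()
        if scale == 0.0:
            return -np.inf
        log_p += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return log_p

def classify(obs, models):
    """Return the word label whose HMM gives the highest log-likelihood.

    models maps a word label to a (pi, A, B) tuple.
    """
    return max(models, key=lambda word: log_likelihood(obs, *models[word]))
```

In use, a word image would pass through `stripe_density_features`, then `quantize`, and the resulting symbol sequence would be handed to `classify` together with one `(pi, A, B)` tuple per word class. Scaling the forward variables at each step is a common design choice here because raw probabilities shrink quickly for long stripe sequences.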
