Offline handwritten Amharic word recognition

This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using hidden Markov models (HMMs). The first approach builds word models from the concatenated features of the constituent characters; the second concatenates the HMMs of the constituent characters to form the word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters, but it does require text line detection and extraction of structural features, which are done using the direction field tensor. The recognition system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.
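The second approach, chaining character HMMs into a word model, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes left-to-right character models given as plain transition matrices, where any probability mass missing from the last row is treated as the exit probability of that character. The function name `concat_char_hmms` and the numeric values are hypothetical.

```python
def concat_char_hmms(char_models):
    """Chain character-level transition matrices into one word-level matrix.

    Assumption for this sketch: each model is a square list-of-rows matrix,
    and 1 - sum(last row) is the probability of leaving that character.
    """
    total = sum(len(a) for a in char_models)
    word = [[0.0] * total for _ in range(total)]
    offset = 0
    for idx, a in enumerate(char_models):
        n = len(a)
        # Copy the character's own transitions onto the block diagonal.
        for i in range(n):
            for j in range(n):
                word[offset + i][offset + j] = a[i][j]
        # Route the exit mass to the first state of the next character.
        if idx < len(char_models) - 1:
            exit_prob = 1.0 - sum(a[n - 1])
            word[offset + n - 1][offset + n] = exit_prob
        offset += n
    return word


# Two hypothetical two-state character models forming a two-character word.
char_a = [[0.6, 0.4], [0.0, 0.7]]
char_b = [[0.5, 0.5], [0.0, 0.8]]
word_model = concat_char_hmms([char_a, char_b])
for row in word_model:
    print(row)
```

The word model has no parameters of its own: it reuses the character models unchanged, so a word absent from the training data can still be modeled as long as its characters have been trained. The exit mass of the final character is left unassigned here and would act as the word's exit probability.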
