Recognition of traditional Mongolian script using primitives and template matching methods

Designing character recognition algorithms that are both fast and accurate has become a necessity, given the importance of character recognition for purposes such as preserving, maintaining, and promoting cultural and historical heritage, including scriptures. Traditional Mongolian script, with its unique writing style and many font variations, poses particular challenges for character recognition. In this paper we study Optical Character Recognition (OCR) of typewritten and woodcut-printed Mongolian script using primitive-based and template matching methods. The template matching (TM) method has two phases, separating the letters and then recognizing them, which are processed separately, whereas in the primitive method separation and recognition are done simultaneously. We developed software to test the TM method. It worked well on typewritten documents, but only with certain fonts, and performed poorly on woodcut prints. We therefore developed an algorithm that recognizes Mongolian script by decomposing it into its constituent primitives. We assume that all Mongolian script letters are composed of seven primitive elements. First, the primitive elements are extracted using a modified Hough transform and stored in a primitive array. The elements of this array are then compared, from first to last, with Character Identification Vectors (CIVs) to recognize the characters. The primitive method is able to recognize any type of printed document with higher accuracy and greater efficiency than the TM method.
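To illustrate how separation and recognition can happen in a single pass over the primitive array, the following is a minimal sketch, not the implementation described in the paper. The primitive codes, the contents of `CIV_TABLE`, the `recognize` function, and the greedy longest-match rule are all hypothetical assumptions introduced for illustration; the abstract does not specify the seven primitives, the actual vectors, or how ambiguities are resolved. The Hough-based extraction step is assumed to have already produced the primitive array.

```python
from typing import Dict, List, Tuple

# Hypothetical CIV table: each character label maps to the sequence of
# primitive codes assumed to identify it. The labels and integer codes
# are placeholders, not the paper's actual primitives or vectors.
CIV_TABLE: Dict[str, Tuple[int, ...]] = {
    "A": (1, 2),
    "B": (1, 2, 3),
    "C": (4,),
    "D": (5, 6),
}


def recognize(primitives: List[int],
              civ_table: Dict[str, Tuple[int, ...]] = CIV_TABLE) -> List[str]:
    """Scan the primitive array from first to last element.

    At each position the longest matching CIV is taken (an assumed
    tie-breaking rule), so character boundaries are found while the
    characters are being recognized.
    """
    # Try longer vectors first so (1, 2, 3) wins over its prefix (1, 2).
    candidates = sorted(civ_table.items(),
                        key=lambda item: len(item[1]),
                        reverse=True)
    result: List[str] = []
    i = 0
    while i < len(primitives):
        for label, civ in candidates:
            if tuple(primitives[i:i + len(civ)]) == civ:
                result.append(label)
                i += len(civ)  # skip past the primitives just consumed
                break
        else:
            # No CIV matches here: mark as unknown and move on by one.
            result.append("?")
            i += 1
    return result


if __name__ == "__main__":
    print(recognize([1, 2, 3, 4, 1, 2, 5, 6]))  # ['B', 'C', 'A', 'D']
```

Because the scan consumes primitives as it matches them, no separate letter-segmentation phase is needed, which is the contrast with the two-phase TM method drawn in the abstract.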
