Sub-Word Based Mongolian Offline Handwriting Recognition

Mongolian is an agglutinative language, so a large number of words are derived from the same stems by attaching different suffixes. This morphological richness leads to high out-of-vocabulary (OOV) rates and causes data sparsity. The recognition system proposed in this paper consists of three parts: handwritten image preprocessing, mapping of images to grapheme sequences, and decoding with a sub-word-based language model (LM). We present a sub-word-based n-gram LM to address the high OOV rate. Based on the characteristics of Mongolian, we modified the traditional token passing algorithm to improve decoding speed and to make it easy to combine with any n-gram LM. We evaluated sub-word units at different levels on the open Mongolian offline handwriting dataset (MHW). The bi-syllable 2-gram LM performed best, with word error rates (WERs) of 18.32% and 23.22% on the two test sets. Our experiments show that this method recognizes in-vocabulary words with higher accuracy and can also recognize OOV words with some accuracy.
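To illustrate why a sub-word LM helps with OOV words, the following is a minimal sketch of a sub-word 2-gram model, not the system described above. The sub-word units (`ab`, `cd`, `x`, `y`) are hypothetical placeholders rather than real Mongolian syllables, the boundary markers `<w>` and `</w>` are illustrative names, and add-one smoothing is assumed only to keep the example short.

```python
import math
from collections import Counter

def train_bigram(words):
    """Count sub-word bigrams. `words` is a list of words, each a list of units."""
    history, bigrams = Counter(), Counter()
    for units in words:
        seq = ["<w>"] + list(units) + ["</w>"]   # hypothetical boundary markers
        history.update(seq[:-1])                 # counts of each unit as a history
        bigrams.update(zip(seq[:-1], seq[1:]))
    # Units that can be predicted: every sub-word plus the end marker.
    vocab = {u for units in words for u in units} | {"</w>"}
    return history, bigrams, vocab

def log_prob(units, model):
    """Log-probability of one word under the add-one smoothed 2-gram model."""
    history, bigrams, vocab = model
    seq = ["<w>"] + list(units) + ["</w>"]
    total = 0.0
    for prev, cur in zip(seq[:-1], seq[1:]):
        p = (bigrams[(prev, cur)] + 1) / (history[prev] + len(vocab))
        total += math.log(p)
    return total

# Toy training lexicon: two "stems" (ab, cd) combined with two "suffixes" (x, y).
train_words = [["ab", "x"], ["ab", "y"], ["cd", "x"]]
model = train_bigram(train_words)

seen = log_prob(["ab", "x"], model)   # word present in the training lexicon
oov = log_prob(["cd", "y"], model)    # word never seen as a whole
print(f"seen word: {seen:.3f}, unseen word: {oov:.3f}")
```

The word `cd`+`y` never occurs in the toy lexicon, so a word-level LM would treat it as OOV. Because both of its units were seen in other words, the sub-word model still assigns it a finite score, lower than that of a seen word.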
