End-to-End Model Based on Bidirectional LSTM and CTC for Online Handwritten Mongolian Word Recognition

This paper proposes an end-to-end model for online handwritten word recognition of Traditional Mongolian. To match the characteristics of the input and output data, the model combines a bidirectional Long Short-Term Memory (LSTM) network with a Connectionist Temporal Classification (CTC) layer. The bidirectional LSTM network is the core of the model, and the CTC layer is placed on top of it. The key step is to transform the LSTM network output into a conditional probability distribution over label sequences through the CTC layer. For each given input sequence, the model then completes the recognition task by choosing the most probable label sequence. In addition, there has been little research on online handwritten Mongolian recognition. In this study, we therefore also examine the incorrectly recognized labels, identify the types of errors, and analyze their possible causes.
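The CTC step described above can be illustrated with a minimal sketch in plain Python. It assumes the bidirectional LSTM has already produced a per-time-step probability distribution over the label alphabet plus a blank symbol. The function names, the blank index `BLANK = 0`, and the toy inputs are hypothetical and are not taken from the paper. The first function picks a label sequence by best-path (greedy) decoding, and the second computes the conditional probability of a given label sequence with the standard CTC forward recursion.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (not specified in the paper)


def ctc_best_path(probs: Sequence[Sequence[float]], blank: int = BLANK) -> List[int]:
    """Greedy decoding: argmax per time step, merge repeats, drop blanks."""
    decoded: List[int] = []
    prev = None
    for frame in probs:
        k = max(range(len(frame)), key=frame.__getitem__)
        if k != prev and k != blank:
            decoded.append(k)
        prev = k
    return decoded


def ctc_label_probability(
    probs: Sequence[Sequence[float]], label: Sequence[int], blank: int = BLANK
) -> float:
    """p(label | x): sum over all alignments, via the CTC forward recursion."""
    # Extended label sequence with blanks between and around the symbols.
    ext = [blank]
    for symbol in label:
        ext += [symbol, blank]
    size = len(ext)

    # Initialisation: an alignment may start with a blank or the first symbol.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between two different symbols.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid alignment ends in the last symbol or the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)
```

For example, with toy frames `[[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.1, 0.7]]` the per-frame argmax path is symbol 1, blank, symbol 2, so `ctc_best_path` returns `[1, 2]`. Best-path decoding is only an approximation of the most probable label sequence; whether the paper uses it or another decoding scheme is not stated in the abstract.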
