Deep neural network based hidden Markov model for offline handwritten Chinese text recognition

This paper proposes a novel segmentation-free approach to offline handwritten Chinese text recognition based on a deep neural network hidden Markov model (DNN-HMM). Within the general Bayesian framework, three key issues are investigated in depth: feature extraction, character modeling, and language modeling. First, gradient-based features are extracted from each frame (sliding window) and fed to the DNN-based classifier. Second, the text line is modeled sequentially by HMMs, each representing one character class, while the DNN-based classifier computes the posterior probabilities of all HMM states. Finally, a character n-gram language model is integrated with the DNN-HMM character model for the Bayesian decision. Experiments on the ICDAR 2013 competition task of the CASIA-HWDB database show that the proposed approach achieves, to our knowledge, the best published recognition results, with a character error rate (CER) of 6.50%. This significantly outperforms the previously best reported oversegmentation approach (CER of 9.25%) and the segmentation-free approach using a multidimensional long short-term memory recurrent neural network (MDLSTM-RNN; CER of 10.6%).
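The results above are reported as character error rates. As a minimal sketch of how such a figure is commonly computed, the Python below assumes CER is the Levenshtein (edit) distance between the recognized and reference character strings divided by the reference length. The abstract does not spell out the exact definition, and the function names `edit_distance` and `character_error_rate` are illustrative, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum number of character substitutions,
    deletions, and insertions needed to turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Assumed definition: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Toy text line: one character dropped and one substituted.
    reference = "今天天气很好"
    hypothesis = "今天气很号"
    print(f"CER = {character_error_rate(reference, hypothesis):.2%}")
```

Under this assumed definition, insertions can push the CER above 100%, so it is an error rate rather than a bounded accuracy complement.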
