Improving Offline Handwritten Chinese Character Recognition by Iterative Refinement

We present an iterative refinement module that can be applied to the output feature maps of any existing convolutional neural network to further improve classification accuracy. The proposed module, implemented as an attention-based recurrent neural network, iteratively uses its previous predictions to update its attention and then refines its current predictions. In this way, the model can focus on a sub-region of the input image to distinguish visually similar characters (see Figure 1 for an example). We evaluate its effectiveness on the handwritten Chinese character recognition (HCCR) task and observe a significant performance gain. The HCCR task is challenging because of the large number of classes and the small differences between certain characters. To overcome these difficulties, we further propose a novel convolutional architecture that uses both low-level visual cues and high-level structural information. Together with the proposed iterative refinement module, our approach achieves an accuracy of 97.37% on the ICDAR-2013 dataset [1], outperforming previous methods that use raw images as input.
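The refinement loop described above can be illustrated with a minimal sketch. This is not the architecture from the paper: the plain tanh recurrent update, the dot-product attention score, the weight names (`w_cls`, `w_emb`, `w_h`), and the random, untrained weights are all assumptions chosen only to show how a previous prediction can be fed back to update attention over feature-map locations.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random weights; stand-ins for parameters that would be learned."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def iterative_refinement(features, w_cls, w_emb, w_h, steps=3):
    """Hypothetical refinement loop over CNN feature-map locations.

    features: N location vectors of dimension D (a flattened feature map).
    w_cls: C x D classifier, w_emb: D x C prediction embedding,
    w_h: D x D recurrent weights. Returns one (attention, probs) pair per step.
    """
    n, d = len(features), len(features[0])
    hidden = [0.0] * d
    attention = [1.0 / n] * n  # first step: attend uniformly to all locations
    history = []
    for _ in range(steps):
        # Context vector: attention-weighted sum of the location features.
        context = [sum(a * f[j] for a, f in zip(attention, features))
                   for j in range(d)]
        # Simple recurrent update (assumption: plain tanh cell, not a gated unit).
        rec = matvec(w_h, hidden)
        hidden = [math.tanh(r + c) for r, c in zip(rec, context)]
        # Current prediction over the C classes.
        probs = softmax(matvec(w_cls, hidden))
        history.append((attention, probs))
        # Feed the prediction back: it shapes the query for the next attention.
        pred_emb = matvec(w_emb, probs)
        query = [h + e for h, e in zip(hidden, pred_emb)]
        scores = [sum(q * x for q, x in zip(query, f)) for f in features]
        attention = softmax(scores)
    return history


# Toy usage: 16 locations, 8 channels, 5 classes (all sizes are arbitrary).
rng = random.Random(0)
N, D, C = 16, 8, 5
features = rand_matrix(N, D, rng, scale=1.0)
history = iterative_refinement(
    features,
    rand_matrix(C, D, rng),
    rand_matrix(D, C, rng),
    rand_matrix(D, D, rng),
    steps=3,
)
for step, (att, probs) in enumerate(history):
    top_class = max(range(C), key=probs.__getitem__)
    print(step, top_class, round(max(att), 3))
```

With trained weights, the attention at later steps would be expected to concentrate on the sub-region that separates the candidate classes; with the random weights used here, the loop only demonstrates the data flow.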

[1] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.

[2] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[3] Jürgen Schmidhuber, et al. Learning Complex, Extended Sequences Using the Principle of History Compression, 1992, Neural Computation.

[4] Cheng-Lin Liu, et al. Handwritten digit recognition: benchmarking of state-of-the-art techniques, 2003, Pattern Recognit.

[5] Lianwen Jin, et al. High performance offline handwritten Chinese character recognition using GoogLeNet and directional feature maps, 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).

[6] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[7] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8] Cheng-Lin Liu, et al. Pseudo two-dimensional shape normalization methods for handwritten Chinese character recognition, 2005, Pattern Recognit.

[9] Wen-Tsuen Chen, et al. A hierarchical deformation model for online cursive script recognition, 1992, Proceedings., 11th IAPR International Conference on Pattern Recognition. Vol.II. Conference B: Pattern Recognition Methodology and Systems.

[10] Yoshimitsu Komiya, et al. RAV (reparameterized angle variations) algorithm for online handwriting recognition, 2001, International Journal on Document Analysis and Recognition.

[11] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.

[12] Yoshua Bengio, et al. Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark, 2016, Pattern Recognit.

[13] Jürgen Schmidhuber, et al. Multi-column deep neural networks for image classification, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[15] Jürgen Schmidhuber, et al. Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks, 2007, NIPS.

[16] Y. J. Liu, et al. A structural approach to online Chinese character recognition, 1988, [1988 Proceedings] 9th International Conference on Pattern Recognition.

[17] Lawrence D. Jackel, et al. Handwritten Digit Recognition with a Back-Propagation Network, 1989, NIPS.

[18] Yoshua Bengio, et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies, 1995, NIPS.

[19] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.

[20] Fei Yin, et al. Chinese Handwriting Recognition Contest 2010, 2010, 2010 Chinese Conference on Pattern Recognition (CCPR).

[21] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[22] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.

[23] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Wen-Tsuen Chen, et al. A hierarchical deformation model for on-line cursive script recognition, 1994, Pattern Recognit.

[25] Benjamin Graham, et al. Spatially-sparse convolutional neural networks, 2014, ArXiv.

[26] Fei Yin, et al. CASIA Online and Offline Chinese Handwriting Databases, 2011, 2011 International Conference on Document Analysis and Recognition.

[27] John Bennett, et al. The effect of large training set sizes on online Japanese Kanji and English cursive recognizers, 2002, Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition.

[28] Fei Yin, et al. Online and offline handwritten Chinese character recognition: Benchmarking on new databases, 2013, Pattern Recognit.

[29] Nan Jiang. Advances in Chinese As a Second Language: Acquisition and Processing, 2014.

[30] Alexander M. Rush, et al. Sequence-Level Knowledge Distillation, 2016, EMNLP.