End-to-end Chinese text recognition

In this paper, we propose a new method for Chinese text recognition with two main contributions. First, we create a large Chinese text dataset comprising 260 thousand images collected from business cards and 390 thousand synthetic images generated by a rendering engine. Second, we use these images to train a deep network for text recognition that accurately recognizes more than six thousand distinct Chinese characters. Although our system is composed of different types of neural networks (a CNN and an LSTM), it is end-to-end trainable. Experiments demonstrate that our system achieves high recognition accuracy and that synthetic data plays an important role in reaching it.
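To make the dataset composition concrete, the following minimal sketch computes the share of synthetic images from the two counts given above and draws per-batch source labels in proportion to pool size. Only the two image counts come from the abstract; the function names and the proportional sampling scheme are hypothetical, since the abstract does not describe how real and synthetic images are mixed during training.

```python
import random
from typing import List

# Image counts stated in the abstract.
NUM_REAL = 260_000       # images collected from business cards
NUM_SYNTHETIC = 390_000  # images generated by a rendering engine


def synthetic_fraction(num_real: int, num_synthetic: int) -> float:
    """Return the share of the combined dataset that is synthetic."""
    return num_synthetic / (num_real + num_synthetic)


def sample_sources(batch_size: int, num_real: int, num_synthetic: int,
                   rng: random.Random) -> List[str]:
    """Pick a source label for each batch item, weighted by pool size.

    Hypothetical helper: proportional sampling is an assumption made for
    illustration, not a scheme described in the abstract.
    """
    return rng.choices(["real", "synthetic"],
                       weights=[num_real, num_synthetic],
                       k=batch_size)


if __name__ == "__main__":
    total = NUM_REAL + NUM_SYNTHETIC
    frac = synthetic_fraction(NUM_REAL, NUM_SYNTHETIC)
    print(f"total images: {total}, synthetic share: {frac:.0%}")
    print(sample_sources(8, NUM_REAL, NUM_SYNTHETIC, random.Random(0)))
```

Under these counts the combined set holds 650 thousand images, of which 60% are synthetic, so a proportional sampler would draw roughly three synthetic images for every two real ones.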
