Joint Spatial and Radical Analysis Network For Distorted Chinese Character Recognition

Recently, a novel radical analysis network (RAN) was proposed for Chinese character recognition (CCR). The key idea is to treat a Chinese character as a composition of radicals rather than as a single character class. Compared with conventional learning approaches, this effectively alleviates two serious issues in CCR: the enormous number of categories and limited training data. In this paper, we further explore the potential of RAN. First, we show that RAN relaxes the equivariance requirement of a regular convolutional neural network (CNN), particularly under rotation, owing to its finer-grained modeling and local-to-global recognition process. This modeling approach can be regarded as one instance of compositional models. Second, we propose a joint spatial and radical analysis network (JSRAN) to handle the more general situation in which the test data include various affine transformations. On both rotated printed Chinese characters and natural scene images, JSRAN outperforms RAN and a traditional CNN. Finally, through visualization analysis, we empirically explain why JSRAN yields a remarkable improvement.
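As a rough illustration of the compositional idea only, the sketch below represents each character as a prefix-order sequence of structure symbols and radicals, written with Unicode ideographic description characters. The table, the function names (`symbol_vocabulary`, `lookup`), and the choice of decomposition are assumptions made for this example; they are not the radical vocabulary or the model used in the paper.

```python
# Toy decomposition table (hypothetical, hand-written for illustration).
# Each character maps to a prefix-order sequence: a structure symbol
# (e.g. left-right, top-bottom) followed by its component radicals.
DECOMPOSITION = {
    "好": ["⿰", "女", "子"],
    "明": ["⿰", "日", "月"],
    "林": ["⿰", "木", "木"],
    "森": ["⿱", "木", "⿰", "木", "木"],
    "休": ["⿰", "亻", "木"],
}


def symbol_vocabulary(table):
    """Return the sorted set of output symbols a sequence model would need."""
    return sorted({symbol for seq in table.values() for symbol in seq})


def lookup(predicted_sequence, table):
    """Map a predicted symbol sequence back to a character, or None if unknown."""
    for char, seq in table.items():
        if seq == list(predicted_sequence):
            return char
    return None


vocab_before = symbol_vocabulary(DECOMPOSITION)

# A character absent from the original table can be added without any new
# output symbol, because its parts are already in the vocabulary.
extended = dict(DECOMPOSITION)
extended["朋"] = ["⿰", "月", "月"]
vocab_after = symbol_vocabulary(extended)

print(len(vocab_before), len(vocab_after))
print(lookup(["⿰", "日", "月"], extended))
```

In a table this small the symbol vocabulary is not smaller than the character set; the point of the sketch is that a new character built from known parts leaves the output vocabulary unchanged, whereas a single-class formulation would need a new class and new training samples for it.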
