Zero-shot Handwritten Chinese Character Recognition with hierarchical decomposition embedding

Abstract: Handwritten Chinese Character Recognition (HCCR) is a challenging problem in pattern recognition because of its large character vocabulary, complex hierarchical character structure, varied writing styles, and scarce training samples. In this paper, we explore the hierarchical knowledge of Chinese characters and present a novel zero-shot HCCR method. First, we model the relations between characters and their primitives, such as radicals and structures, to obtain a tree layout of primitives for each character. Then, we present a novel hierarchical decomposition embedding method that encodes the tree layout into a semantic vector. Next, we devise a Convolutional Neural Network (CNN) based framework that learns both the radicals and the structures of characters via the semantic vector. Because different Chinese characters share common radicals and structures, our method can recognize new categories without any labeled samples from them. Our method is effective in both traditional HCCR and zero-shot HCCR tasks: it achieves competitive performance in the traditional experimental setting and significantly surpasses state-of-the-art methods in the zero-shot experimental setting.
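The pipeline described in the abstract (decomposition tree, then semantic vector, then recognition of an unseen class) can be illustrated with a small sketch. This is not the paper's implementation: the primitive vocabulary, the depth-decay weighting in `embed`, the cosine matching in `recognize`, and the hand-written `cnn_output` vector are all assumptions made for illustration, and the encoding ignores the position of a radical within its structure.

```python
import math

# Hypothetical primitive vocabulary: two structure types plus a few radicals.
PRIMITIVES = ["⿰", "⿱", "木", "日", "月", "口", "艹"]
INDEX = {p: i for i, p in enumerate(PRIMITIVES)}

# Hypothetical decomposition trees: a radical is a string, an internal node
# is a tuple (structure, child, child, ...).
TREES = {
    "林": ("⿰", "木", "木"),
    "明": ("⿰", "日", "月"),
    "杏": ("⿱", "木", "口"),
    "萌": ("⿱", "艹", ("⿰", "日", "月")),
    "朋": ("⿰", "月", "月"),  # treated as an unseen class below
}


def embed(tree, weight=1.0, decay=0.5):
    """Encode a tree as a vector; deeper primitives get smaller weights.

    The decay scheme is an assumption of this sketch, not the paper's rule.
    """
    vec = [0.0] * len(PRIMITIVES)
    if isinstance(tree, str):  # leaf: a single radical
        vec[INDEX[tree]] = weight
        return vec
    structure, *children = tree
    vec[INDEX[structure]] = weight
    for child in children:
        child_vec = embed(child, weight * decay, decay)
        vec = [a + b for a, b in zip(vec, child_vec)]
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize(predicted, candidates):
    """Return the candidate whose embedding is closest to the prediction."""
    return max(candidates, key=lambda ch: cosine(predicted, embed(TREES[ch])))


# Stand-in for the semantic vector a trained CNN might predict for an image
# of 朋; the values are invented for this example.
cnn_output = [1.1, 0.0, 0.05, 0.1, 0.9, 0.0, 0.0]
prediction = recognize(cnn_output, list(TREES))
```

In this toy setup, 朋 never needs its own training images: its embedding is built entirely from the structure ⿰ and the radical 月, both of which also occur in other characters, so a network that predicts primitives well could still land near it.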
