Similar Handwritten Chinese Character Discrimination by Weakly Supervised Learning

Traditional approaches to handwritten Chinese character recognition struggle to classify similar characters. In this paper, we propose to discriminate similar handwritten Chinese characters using weakly supervised learning. Our approach learns a discriminative SVM for each similar pair, which simultaneously localizes the discriminative region of the similar characters and performs the classification. For the first time, similar handwritten Chinese character recognition (SHCCR) is formulated as an optimization problem extended from the SVM. We also propose a novel feature descriptor, Gradient Context, and apply a bag-of-words model to represent regions at different scales. Our method does not need to select a fixed-size sub-window to differentiate similar characters. This unconstrained property makes our method well adapted to the high variance in the size and position of discriminative regions in similar handwritten Chinese characters. We evaluate the proposed approach on the CASIA Chinese character data set, and the results show that our method outperforms the state of the art.
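
The joint localization and classification idea can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a linear scoring function over an unnormalized bag-of-words histogram, so that a region's score is the sum of per-word weights inside it, and it searches all rectangles exhaustively on a small grid. The function name `best_region`, the grid of visual-word indices, and the weight values are hypothetical, and the Gradient Context descriptor itself is not modeled.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def best_region(words: List[List[int]],
                weights: List[float],
                bias: float = 0.0) -> Tuple[float, Optional[Box]]:
    """Return the highest-scoring rectangle and its score.

    words[y][x] is the visual-word index at grid cell (y, x), or -1 if the
    cell has no feature. weights[k] is the (assumed, already learned) linear
    weight of visual word k. With an unnormalized histogram, the linear score
    of a region equals the sum of the weights of the words it contains.
    """
    h, w = len(words), len(words[0])

    # Integral image of per-cell scores, so any rectangle sums in O(1).
    integ = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            k = words[y][x]
            cell = weights[k] if k >= 0 else 0.0
            integ[y + 1][x + 1] = (cell + integ[y][x + 1]
                                   + integ[y + 1][x] - integ[y][x])

    best_score = float("-inf")
    best_box: Optional[Box] = None
    # Exhaustive search over all rectangles: no fixed window size is imposed.
    for top in range(h):
        for bottom in range(top + 1, h + 1):
            for left in range(w):
                for right in range(left + 1, w + 1):
                    score = (integ[bottom][right] - integ[top][right]
                             - integ[bottom][left] + integ[top][left] + bias)
                    if score > best_score:
                        best_score, best_box = score, (top, left, bottom, right)
    return best_score, best_box


if __name__ == "__main__":
    # Toy 3x3 grid with a 3-word vocabulary; word 2 is the discriminative one.
    grid = [[0, 0, 1],
            [0, 2, 2],
            [1, 2, 2]]
    w = [-1.0, 0.5, 2.0]
    score, box = best_region(grid, w)
    label = 1 if score > 0 else -1  # sign of the best region score
    print(score, box, label)
```

In this sketch the region is a latent variable chosen at prediction time, and the sign of the best region's score gives the class for one similar pair. Exhaustive search is only practical on small grids; a branch-and-bound subwindow search is one possible replacement for larger inputs.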