Importance sampling based discriminative learning for large scale offline handwritten Chinese character recognition

This paper reports a discriminative learning framework based on importance sampling for large-scale classification tasks. The framework assigns each training sample a weight given by a sample importance weight function derived from the Bayesian classification rule. Three methods are used to calculate the sample importance weights for learning the modified quadratic discriminant function (MQDF). (1) Rejection sampling: important samples are selected as a training subset, and MQDFs at different levels are trained by focusing on different types of samples. (2) Boosting: the sample importance weights are modified iteratively according to the recognition performance. (3) Minimum classification error (MCE) rule: the parameter of the importance weight function is estimated using the MCE rule. Cursive samples are usually misclassified, or prone to be misclassified, by an MQDF learned under the maximum likelihood estimation (MLE) rule. The proposed importance sampling framework therefore makes the MQDF classifier focus more on cursive samples than on normal samples. This strategy allows the MQDF to achieve higher accuracy while maintaining low computational complexity. Comprehensive experiments on three handwritten Chinese character datasets demonstrate that the proposed framework achieves promising character recognition accuracy.

Propose an importance sampling based discriminative learning framework for large-scale classification problems. Introduce rejection sampling, a boosting algorithm, and MCE to estimate the sample importance weights. Compare the methods on a large-scale character recognition problem and summarize them under a unified framework.
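The rejection sampling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the form of the importance weight function, so the sketch assumes a logistic function of the classification margin (true-class score minus best rival score). The names `importance_weight`, `rejection_sample`, and the steepness parameter `xi` are hypothetical.

```python
import math
import random


def importance_weight(margin, xi=1.0):
    """Assumed weight: logistic function of the negative margin.

    margin: true-class score minus best rival score (hypothetical input;
            a small or negative margin means a hard, e.g. cursive, sample).
    xi:     assumed steepness parameter of the weight function.
    """
    z = xi * margin
    # Numerically stable evaluation of 1 / (1 + exp(z)).
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def rejection_sample(margins, xi=1.0, seed=0):
    """Keep sample i with probability w_i / max(w), so hard samples dominate."""
    rng = random.Random(seed)
    weights = [importance_weight(m, xi) for m in margins]
    w_max = max(weights)
    return [i for i, w in enumerate(weights) if rng.random() < w / w_max]


if __name__ == "__main__":
    # Toy margins: negative = misclassified, near zero = borderline.
    margins = [-3.0, -0.5, 0.0, 0.8, 4.0, 6.0]
    kept = rejection_sample(margins, xi=1.0, seed=0)
    print("indices kept for the training subset:", kept)
```

Under these assumptions, the sample with the largest weight is always accepted, while confidently classified samples are kept only rarely. A classifier retrained on the selected subset would therefore see proportionally more of the hard samples.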
