Design Compact Recognizers of Handwritten Chinese Characters Using Precision Constrained Gaussian Models, Minimum Classification Error Training and Parameter Compression

In our previous work, a precision constrained Gaussian model (PCGM) was proposed for character modeling to design compact recognizers of handwritten Chinese characters. A maximum likelihood training procedure was developed to estimate model parameters from training data. In this paper, we extend that work by using minimum classification error (MCE) training to improve recognition accuracy and a split vector quantization technique to compress model parameters. Compared with state-of-the-art MCE-trained and compressed classifiers based on the modified quadratic discriminant function, PCGM-based classifiers achieve a much better memory-accuracy tradeoff and therefore offer a good solution for designing compact handwriting recognition systems for East Asian languages such as Chinese, Japanese, and Korean.
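To make the parameter compression step concrete, the following is a minimal sketch of split vector quantization in general, not the implementation used in this paper. It assumes that each parameter vector is cut into fixed-size sub-vectors, that one codebook per sub-vector position is trained with plain k-means (Lloyd's algorithm), and that only the codeword indices are stored. The function names, the sub-vector size `sub_dim`, and the codebook size `k` are all hypothetical choices for illustration.

```python
import random


def nearest(point, codebook):
    """Index of the codeword closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, codebook[c])),
    )


def train_codebook(points, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm); assumes k <= len(points)."""
    rng = random.Random(seed)
    codebook = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: map every point to its nearest codeword.
        assign = [nearest(p, codebook) for p in points]
        # Update step: move each codeword to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def split(vec, sub_dim):
    """Cut one vector into consecutive sub-vectors of length `sub_dim`."""
    return [vec[i:i + sub_dim] for i in range(0, len(vec), sub_dim)]


def train_split_vq(vectors, sub_dim, k):
    """Train one codebook per sub-vector position (stream)."""
    streams = zip(*(split(v, sub_dim) for v in vectors))
    return [train_codebook(list(s), k, seed=i) for i, s in enumerate(streams)]


def encode(vec, codebooks, sub_dim):
    """Replace each sub-vector with the index of its nearest codeword."""
    return [nearest(sub, cb) for sub, cb in zip(split(vec, sub_dim), codebooks)]


def decode(indices, codebooks):
    """Rebuild an approximate vector by concatenating the chosen codewords."""
    out = []
    for idx, cb in zip(indices, codebooks):
        out.extend(cb[idx])
    return out
```

Under these assumptions, a stored vector of dimension D costs D / `sub_dim` small integer indices plus its share of the common codebooks, rather than D floating-point values. That is the source of the memory saving; the price is the quantization error introduced when `decode` substitutes codewords for the original sub-vectors.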
