Minimum classification error training for online handwritten word recognition

We describe an application of the minimum classification error (MCE) training criterion to online, unconstrained-style handwritten word recognition. The system uses allograph HMMs to handle writer variability. Results on vocabularies of 5k to 10k words show that MCE training reduces the word error rate by around 17% compared to the baseline maximum likelihood system.
