Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition

We present a novel confidence- and margin-based discriminative training approach for adapting the models of a hidden Markov model (HMM)-based handwriting recognition system to different handwriting styles and their variations. Most current approaches use maximum-likelihood (ML) trained HMM systems and try to adapt their models to different writing styles using writer-adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria is used to train writer-independent handwriting models. For model adaptation during decoding, unsupervised confidence-based discriminative training at the word and frame level within a two-pass decoding process is proposed. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is decreased by 33% relative compared to an ML-trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is decreased by 25% relative.
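
The two-pass, confidence-based adaptation described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names `Hypothesis`, `select_confident_words`, and `two_pass_decode`, the callables `decode` and `adapt`, and the threshold value of 0.9 are all assumptions introduced for illustration. Only word-level confidence selection is shown; frame-level selection and the discriminative update itself are left behind the hypothetical `adapt` callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Hypothesis:
    """First-pass output for one segment (hypothetical structure)."""
    segment_id: str
    words: List[str]
    confidences: List[float]  # one confidence per word, assumed to lie in [0, 1]


def select_confident_words(
    hyps: List[Hypothesis], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Keep only words whose confidence reaches the (assumed) threshold."""
    selected = []
    for hyp in hyps:
        for word, conf in zip(hyp.words, hyp.confidences):
            if conf >= threshold:
                selected.append((hyp.segment_id, word, conf))
    return selected


def two_pass_decode(
    segments: List[str],
    decode: Callable[[str], Hypothesis],
    adapt: Callable[[List[Tuple[str, str, float]]], Callable[[str], Hypothesis]],
    threshold: float = 0.9,
) -> List[Hypothesis]:
    """Pass 1: decode with the writer-independent model.
    Adaptation: use confident words as unsupervised transcriptions.
    Pass 2: decode again with the adapted model returned by `adapt`."""
    first_pass = [decode(seg) for seg in segments]
    supervision = select_confident_words(first_pass, threshold)
    adapted_decode = adapt(supervision)
    return [adapted_decode(seg) for seg in segments]
```

Filtering by confidence before adaptation is meant to limit how many first-pass recognition errors end up in the unsupervised transcriptions used to update the model.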
