Ensemble Methods to Improve the Performance of an English Handwritten Text Line Recognizer

This paper describes recent work on ensemble methods for offline handwritten text line recognition. We discuss techniques for building ensembles of recognizers by systematically altering either the training data or the system architecture. To combine the results of the ensemble members, we apply ROVER, a voting-based framework commonly used in continuous speech recognition, and extend this framework with a statistical combination method. The experimental evaluation shows that the proposed ensemble methods have the potential to improve recognition accuracy over that of a single recognizer.
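
As a rough illustration of the voting idea behind ROVER, the following minimal sketch aligns the word sequences produced by several recognizers into a common sequence of slots and then picks the most frequent word in each slot. It is not the implementation used in this work: the function names (`align`, `rover_vote`), the unit edit costs, and the tie-breaking rule (the earliest recognizer wins) are assumptions made for the example, and the statistical combination extension mentioned above is not shown.

```python
from collections import Counter

NULL = ""  # marks "no word" in a slot (hypothetical convention for this sketch)


def align(network, words, depth):
    """Merge `words` into `network` using a minimum edit distance alignment.

    `network` is a list of slots; each slot holds one entry per hypothesis
    merged so far (`depth` entries), with NULL where a hypothesis has no word.
    """
    n, m = len(network), len(words)
    # cost[i][j]: cheapest alignment of the first i slots with the first j words
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if words[j - 1] in network[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,  # match or substitution
                             cost[i - 1][j] + 1,        # slot without a word
                             cost[i][j - 1] + 1)        # word without a slot

    # Trace back from the end and build the merged network in reverse order.
    merged = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if words[j - 1] in network[i - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                merged.append(network[i - 1] + [words[j - 1]])
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            merged.append(network[i - 1] + [NULL])
            i -= 1
        else:
            merged.append([NULL] * depth + [words[j - 1]])
            j -= 1
    merged.reverse()
    return merged


def rover_vote(hypotheses):
    """Combine text line hypotheses by frequency voting over aligned slots."""
    token_lists = [h.split() for h in hypotheses]
    network = [[w] for w in token_lists[0]]
    for depth, words in enumerate(token_lists[1:], start=1):
        network = align(network, words, depth)
    result = []
    for slot in network:
        # Ties go to the entry that appears first in the slot.
        winner = Counter(slot).most_common(1)[0][0]
        if winner != NULL:
            result.append(winner)
    return " ".join(result)


if __name__ == "__main__":
    outputs = [
        "the quick brown fox",
        "the quick brown box",
        "a quick brown fox",
    ]
    print(rover_vote(outputs))  # the quick brown fox
```

In this toy run each recognizer makes at most one error, and the two correct votes in every slot outweigh the single wrong one. Voting of this kind only helps when the ensemble members make different errors, which is why the ensembles are built by varying the training data or the architecture.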
