Off-line cursive handwriting recognition using multiple classifier systems—on the influence of vocabulary, ensemble, and training set size

Unconstrained handwritten text recognition is one of the most difficult problems in the field of pattern recognition. Recently, a number of classifier creation and combination methods, known as ensemble methods, have been proposed in the field of machine learning. They have shown improved recognition performance over single classifiers. In this paper, we examine the influence of the vocabulary size, the number of training samples, and the number of classifiers on the performance of three ensemble methods in the context of cursive handwriting recognition. All experiments were conducted using an off-line handwritten word recognizer based on hidden Markov models (HMMs).
