Segmenting handwritten text using supervised classification techniques

Recent work on extracting features from the gaps in handwritten text makes it possible to classify each gap as inter-word or intra-word using suitable classification techniques. In this paper, we apply five supervised classification algorithms from the machine learning field to both the original dataset and a reduced dataset containing the best features, selected using mutual information. The classifiers are compared using McNemar's test. We find that SVMs and MLPs outperform the other classifiers and that selecting features as a preprocessing step works well.
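
To illustrate how two classifiers can be compared with McNemar's test, the following is a minimal sketch and not the implementation used in the paper. It assumes both classifiers were evaluated on the same test gaps, uses the continuity-corrected form of the statistic, and takes the p-value from a chi-squared distribution with one degree of freedom. The function name `mcnemar_test` and the label encoding are hypothetical.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers.

    All three sequences must refer to the same test examples.
    Returns (statistic, p_value).
    """
    # Only the discordant examples matter:
    # n01 = A wrong and B right, n10 = A right and B wrong.
    n01 = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    n10 = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)

    if n01 + n10 == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return 0.0, 1.0

    # Continuity-corrected statistic, clamped so the correction cannot
    # create a difference where the discordant counts are equal.
    stat = max(abs(n01 - n10) - 1, 0) ** 2 / (n01 + n10)

    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical labels: 1 = inter-word gap, 0 = intra-word gap.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    pred_a = [1, 0, 0, 1, 0, 1, 1, 0]
    pred_b = [1, 0, 1, 1, 0, 0, 0, 0]
    print(mcnemar_test(y_true, pred_a, pred_b))
```

A small p-value suggests that the two classifiers make errors at different rates on the shared test set. With very few discordant examples the chi-squared approximation is unreliable, and an exact binomial test would be a safer choice.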
