Learning prototypes for online handwritten digits

A writer-independent handwriting recognition system must recognize a wide variety of handwriting styles while maintaining high accuracy on data from any one of those styles. As the number of writing styles increases, so does the variability of the data distribution. This poses an optimization problem: how can the data be modeled well while keeping the representation as simple as possible? If we can identify N different styles of writing individual characters (referred to as lexemes), these can be modeled as N relatively simple, independent distributions. We describe a template-based system for online handwriting recognition that uses a string-matching distance measure and takes advantage of lexemes to reduce the number of templates that must be stored. We present a method for identifying lexemes and lexeme representatives, and give experimental results for a set of handwritten digits collected from 21 different writers. The use of lexeme representatives reduces classification time by 90.2% while retaining approximately 98% of the recognition accuracy.
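The idea of replacing many stored templates with one representative per lexeme can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: strokes are encoded as strings of hypothetical direction symbols (U, D, L, R), the string-matching distance is plain Levenshtein edit distance, and the representative of each lexeme is its medoid. The grouping of samples into lexemes is taken as given here; the paper's own distance measure and its method of identifying lexemes may differ.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences (assumed measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def medoid(cluster: List[str]) -> str:
    """Pick the member with the smallest total distance to the other members."""
    return min(cluster, key=lambda c: sum(edit_distance(c, o) for o in cluster))


def classify(sample: str, templates: List[Tuple[str, str]]) -> str:
    """Return the label of the nearest (label, template) pair."""
    return min(templates, key=lambda t: edit_distance(sample, t[1]))[0]


# Hypothetical lexemes: each cluster holds samples of one writing style.
lexemes = {
    "1": ["DDDD", "DDD", "UDDDD"],
    "7": ["RRDD", "RRDDD", "RDD"],
}

# Keep one representative per lexeme instead of every sample.
representatives = [(label, medoid(samples)) for label, samples in lexemes.items()]
predicted = classify("RRRDD", representatives)
```

In this toy setup, classification compares the input against two representatives rather than six stored samples, which is the source of the time saving; the cost is that a representative may match an unusual sample less closely than the full template set would.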
