Finding structure in diversity: a hierarchical clustering method for the categorization of allographs in handwriting

This paper introduces a variant of agglomerative hierarchical clustering and uses it to organize character shapes (allographs) from large handwriting data sets into a hierarchical structure. Such a hierarchy can serve as the basis for a systematic naming scheme for character shapes. The paper describes the problems with existing methods and then explains the proposed method. The method is applied to a very large set of characters, separately for each letter of the alphabet; the relevant clusters are then identified and each is given a unique name. Each cluster represents an allograph prototype.
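For orientation, the generic technique that the paper builds on can be sketched as follows. This is a minimal illustration of standard average-linkage agglomerative clustering, not the variant proposed in the paper, whose details are not given in this abstract. The function names, the choice of average linkage with Euclidean distance, and the two-dimensional feature vectors are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    # Start with every sample in its own cluster (a tuple of point indices).
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        distance = average_linkage(a, b, points)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b, distance))
    return merges


# Hypothetical 2-D feature vectors for six samples of a single letter.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
merges = agglomerate(samples)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

The merge history defines the hierarchy: early merges join near-identical shapes, later merges join increasingly dissimilar groups. Cutting the hierarchy at a chosen level yields the clusters that would then be named as allograph prototypes.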
