Using codebooks of fragmented connected-component contours in forensic and historic writer identification

Recent advances in 'off-line' writer identification allow for new applications in handwritten text retrieval from archives of scanned historical documents. This paper describes new algorithms for forensic or historical writer identification, using the contours of fragmented connected-components in free-style handwriting. The writer is considered to be characterized by a stochastic pattern generator, producing a family of character fragments (fraglets). Using a codebook of such fraglets from an independent training set, the probability distribution of fraglet contours was computed for an independent test set. Results revealed a high sensitivity of the fraglet histogram in identifying individual writers on the basis of a paragraph of text. Large-scale experiments on the optimal size of Kohonen maps of fraglet contours were performed, showing usable classification rates within a non-critical range of Kohonen map dimensions. The proposed automatic approach bridges the gap between image-statistics approaches and purely knowledge-based manual character-based methods.
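The fraglet-histogram step described above can be illustrated with a minimal sketch. It assumes each fraglet contour has already been resampled to a fixed-length coordinate vector and that a trained codebook (for example, the flattened nodes of a Kohonen map) is given as an array. The function names, the nearest-neighbour assignment by Euclidean distance, and the chi-square histogram distance used for matching are illustrative assumptions, not details confirmed by the abstract.

```python
import numpy as np

def fraglet_histogram(contours, codebook):
    """Estimate the probability distribution of codebook usage for one sample.

    contours: (n, d) array, one fixed-length contour vector per fraglet (assumed).
    codebook: (k, d) array of prototype fraglet contours (assumed given).
    """
    contours = np.asarray(contours, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    if contours.shape[0] == 0:
        raise ValueError("at least one fraglet contour is required")
    # Squared Euclidean distance from every contour to every codebook entry.
    d2 = ((contours[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    # Count how often each codebook entry is the best match, then normalize.
    counts = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    return counts / counts.sum()

def chi_square_distance(p, q, eps=1e-12):
    """Chi-square distance between two histograms (an assumed choice of metric)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

def identify_writer(query_hist, reference_hists):
    """Return the writer id whose reference histogram is closest to the query."""
    return min(
        reference_hists,
        key=lambda writer: chi_square_distance(query_hist, reference_hists[writer]),
    )

# Toy usage with a two-entry codebook of 2-D "contours".
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
sample = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [0.0, 0.0]])
hist = fraglet_histogram(sample, codebook)
references = {"writer_a": [0.7, 0.3], "writer_b": [0.2, 0.8]}
best = identify_writer(hist, references)
```

In this toy setting three of the four contours fall nearest the first codebook entry, so the histogram leans towards that entry and the sample is matched to the reference writer with the most similar distribution.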
