Automatic Handwriting Feature Extraction, Analysis and Visualization in the Context of Digital Palaeography

Digital palaeography is an emerging research area that aims to introduce digital image processing techniques into palaeographic analysis in order to provide objective, quantitative measurements. This paper explores the use of a fully automated handwriting feature extraction, visualization, and analysis system for digital palaeography, one that bridges the gap between traditional and digital palaeography in the feature extraction techniques and handwriting metrics it deploys. We propose a set of features that are more closely related to conventional palaeographic assessment metrics than those commonly adopted in automatic writer identification. These features are empirically tested on two datasets to assess their effectiveness for automatic writer identification and as an aid to attributing individual handwriting characteristics in historical manuscripts. Finally, we introduce tools that support comparative visualization of the extracted features, and we show how the features can best be exploited in implementing a content-based image retrieval (CBIR) system for digital archiving.
