Scribe Attribution for Early Medieval Handwriting by Means of Letter Extraction and Classification and a Voting Procedure for Larger Pieces

This study investigates a method for attributing scribal hands that, like traditional palaeography, is based on the comparison of letter shapes. The system was developed for and evaluated on early medieval Caroline minuscule manuscripts. Generating a prediction for a page image involves writing identification, letter segmentation, and letter classification; the system then uses the resulting letter proposals to predict the scribal hand behind the page. Letters and sequences of connected letters are identified by connected component labeling and then split into letter-size pieces. The hand (and character) prediction relies on a dataset containing instances of the letters b, d, p, and q, cut out from manuscript pages whose scribal origin is known. Letters are represented by features capturing the distribution of foreground, and cosine similarity is used for nearest-neighbor classification. The hand behind a page is finally predicted by a voting procedure that takes the highest-scoring letter-level hits as its input. This hand prediction method was evaluated on pages from five different hands and reached an accuracy above 99% for four of them and 87% for the fifth, a significantly more difficult one. At the level of single top-listed letters, the hand was correctly predicted in 83% of the cases.
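The classification and voting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The grid-based feature (`zone_features`), the `top_k` cutoff, the simple majority vote, and all function names are assumptions made for illustration, since the abstract states only that features capture the distribution of foreground, that cosine similarity drives nearest-neighbor classification, and that the highest-scoring letter-level hits are passed to a vote.

```python
import math
from collections import Counter


def zone_features(glyph, rows=2, cols=2):
    """Fraction of foreground (ink) pixels in each cell of a rows x cols grid.

    glyph is a list of equal-length lists of 0/1 values (1 = ink).
    The grid layout is an assumed stand-in for the paper's features.
    """
    h, w = len(glyph), len(glyph[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest(feats, reference):
    """Return (score, letter, hand) of the most similar reference instance.

    reference is a list of (features, letter, hand) tuples built from
    letters whose scribal origin is known.
    """
    return max((cosine(feats, f), letter, hand) for f, letter, hand in reference)


def predict_hand(candidates, reference, top_k=3):
    """Vote over the top_k highest-scoring letter-level hits on a page."""
    hits = sorted((nearest(f, reference) for f in candidates),
                  key=lambda hit: hit[0], reverse=True)
    votes = Counter(hand for _, _, hand in hits[:top_k])
    return votes.most_common(1)[0][0]


# Hypothetical reference set: two hands, "A" and "B", with made-up feature vectors.
reference = [
    ([1.0, 0.0, 1.0, 1.0], "b", "A"),
    ([0.0, 1.0, 1.0, 1.0], "d", "A"),
    ([1.0, 0.2, 1.0, 0.6], "b", "B"),
]
# Feature vectors of letter candidates segmented from one page (also made up).
page = [
    [0.9, 0.0, 1.0, 1.0],
    [0.0, 0.9, 1.0, 0.9],
    [1.0, 0.2, 0.9, 0.6],
]
print(predict_hand(page, reference))
```

Restricting the vote to the highest-scoring hits means that poorly segmented pieces, which tend to match nothing in the reference set well, have little influence on the page-level decision.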
