HisDoc 2.0: Toward Computer-assisted Paleography

HisDoc 2.0 is a research project on textual heritage analysis funded by the Swiss National Science Foundation (SNSF). It builds on the groundwork of the HisDoc project, which concentrated on automated methods for codicological and philological studies. The objective of HisDoc 2.0 is computational paleographical analysis, more specifically the analysis of scripts, writing styles, and scribes. While the first project aimed at analyzing simple layouts and the textual content of historical documents, HisDoc 2.0 is dedicated to complex layouts, including fine-grained text-line localization and script analysis. Furthermore, semantic domain knowledge extracted from catalogs available in databases such as e-codices or manuscripta mediaevalia is incorporated into document image analysis. In HisDoc 2.0, we perform fundamental research to facilitate the development of tools that build on existing expert knowledge and that will, in the future, support humanities scholars who examine and annotate manuscripts.
