Efficient algorithms and user interaction for metadata extraction from historical maps

In this research project we develop a system for metadata extraction from historical maps. Such maps are valuable sources of information for researchers in various disciplines. However, metadata describing their (geographic) contents are scarce, because such information mostly has to be extracted by hand. Examining related work, we find that no holistic, automated solution to this problem has been developed so far. Our system aims to speed up metadata extraction by combining efficient algorithms with user interaction. We present our approach, a pipeline consisting of several steps. The system is modular; we discuss two modules of the pipeline that we have already developed and sketch our solutions. We conclude this report with an outlook on future work, which includes crowdsourcing applications. This paper presents the state of my PhD project after the first year and is intended for the Junior PhD track. The project is supervised by Thomas C. van Dijk and Alexander Wolff at Würzburg University.
