Paper information - Towards semi-automatic annotation of toponyms on old maps

Present-day map digitization methods produce data that is semantically opaque: to a machine, a digitized map is merely a collection of bits and bytes. The area it depicts, the places it mentions, and any text contained within legends or written in its margins remain unknown unless a human appraises the image and manually adds this information to its metadata. This problem is especially severe in the case of old maps: these are typically handwritten, may contain text in varying orientations and sizes, and can be in poor condition due to deterioration or damage. As a result, searching the contents of these documents remains challenging, which makes them hard for users to discover, unusable for machine processing and analysis, and thus effectively lost to many forms of public, scientific or commercial use. Fully automatic detection and transcription of place names and legends is likely not achievable with today's technology. We argue, however, that semi-automated methods can eliminate much of the tedious effort required to annotate map scans entirely by hand. In this paper, we showcase early work on semi-automatic place name annotation. In our experiment, we use open source tools to identify locations on the map that potentially represent toponyms. We describe how, in next steps, we aim to extend our experiment by exploiting the spatial layout of the identified candidates to deduce possible place names from existing toponym lists. Ultimately, our goal is to combine this work with a toolset for manual image annotation in a convenient online environment. This will allow curators, researchers, and potentially also the general public to rapidly “tag” and annotate toponyms on digitized maps.

Leif Isaksen | Rainer Simon | Peter Pilgerstorfer | Elton Barker
