Linking archival data to location: a case study at the UK National Archives

Purpose – The National Archives (TNA) is the UK Government's official archive. It stores and maintains records spanning over 1,000 years in both physical and digital form. Much of the information held by TNA includes references to place, and user queries to TNA's online catalogue frequently involve searches for location. The purpose of this paper is to illustrate how TNA has extracted the geographic references in its historic data to improve access to the archives.

Design/methodology/approach – To quickly enhance the existing archival data with geographic information, existing technologies from Natural Language Processing (NLP) and Geographical Information Retrieval (GIR) have been utilised and adapted to historical archives.

Findings – Enhancing the archival records with geographic information has enabled TNA to quickly develop a number of case studies highlighting how geographic information can improve access to large-scale archival collections. The use of existing methods from the GIR doma...
