From text to landscape: locating, identifying and mapping the use of landscape features in a Swiss Alpine corpus

In this paper, we demonstrate how a large corpus, consisting of about 10 000 articles describing Swiss Alpine landscapes and activities and dating back to 1864, can be used to explore the use of language in space. First, we link landscape descriptions to geospatial footprints, which requires new methods for disambiguating toponyms that refer to natural features. Secondly, we identify the natural features used to describe landscapes, and compare and discuss them in the light of previous work based on controlled participant experiments in laboratory settings and on more exploratory ethnographic studies. Finally, we use natural features in combination with geospatial footprints to investigate variations in landscape descriptions across space. Our contributions are threefold. Firstly, we show how a corpus composed of detailed descriptions of natural landscapes can be georeferenced and mapped using density surfaces and an adaptive grid linking footprints to articles. Secondly, we identify 95 natural features in the corpus, forming a vocabulary of terms that reflects known basic levels and their relationships to other, more specific landscape features. Thirdly, we explore the use of natural features in broader spatial and temporal contexts than is possible in typical ethnographic work, by examining when and where particular terms are used within Switzerland with respect to our corpus. This enables us both to characterize individual regions and to measure similarity between regions on the basis of their associated natural features. Our methods could be adapted to other types of corpus, for instance those referring to fine-granularity entities in urban landscapes. Our results are potential building blocks for attaching place-related descriptions to automatically generated sensor data such as photographs or satellite images.
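As an illustration of how similarity between regions might be measured from their associated natural features, the sketch below compares term-count vectors with the cosine measure. The abstract does not state which similarity measure the paper uses, so the choice of cosine is an assumption; the region names and counts are hypothetical toy values, not data from the corpus.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    # Only terms present in both regions contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a region with no features is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical counts of natural-feature terms per region (illustrative only).
region_counts = {
    "region_a": Counter({"glacier": 12, "ridge": 7, "valley": 3}),
    "region_b": Counter({"glacier": 9, "ridge": 5, "lake": 2}),
    "region_c": Counter({"lake": 8, "forest": 6, "valley": 4}),
}

if __name__ == "__main__":
    names = sorted(region_counts)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = cosine_similarity(region_counts[first], region_counts[second])
            print(f"{first} vs {second}: {score:.3f}")
```

Under these toy counts, the two regions dominated by glacier and ridge terms score as far more similar to each other than either does to the region dominated by lake and forest terms.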
