Description and characterization of place properties using topic modeling on georeferenced tags

ABSTRACT User-Generated Content (UGC) is a potential data source that can help us better describe and understand how places are conceptualized and, in turn, better represent places in Geographic Information Science (GIScience). In this article, we aim to aggregate the shared meanings associated with places and link them to a conceptual model of place. Our focus is on the metadata of Flickr images, in the form of locations and tags. We use topic modeling to identify regions associated with shared meanings. We adopt a grid approach and use Latent Dirichlet Allocation to generate topics, each associated with one or more grid cells. We analyze the sensitivity of our results to both the grid resolution and the chosen number of topics, using a range of measures including corpus distance and the coherence value. With a grid resolution of 500 m and 40 topics, we generate meaningful topics that characterize places in London, based on 954 unique tags associated with around 300,000 images and more than 7000 individuals.
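The grid step described in the abstract can be illustrated with a minimal sketch. It only shows how georeferenced tags might be binned into 500 m cells so that each cell becomes one bag-of-tags document, the kind of input a topic model such as Latent Dirichlet Allocation expects. It is not the authors' implementation. The function names, the sample records, and the assumption that coordinates are already in a projected, metre-based reference system are all hypothetical.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 500.0  # grid resolution mentioned in the abstract


def cell_index(x, y, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing (x, y).

    Assumes x and y are projected coordinates in metres (an assumption,
    not something stated in the abstract).
    """
    return (floor(x / cell_size), floor(y / cell_size))


def build_cell_documents(photos, cell_size=CELL_SIZE_M):
    """Group the tags of all photos by grid cell, one tag list per cell."""
    documents = defaultdict(list)
    for x, y, tags in photos:
        documents[cell_index(x, y, cell_size)].extend(tags)
    return dict(documents)


# Hypothetical sample records: (easting_m, northing_m, tags)
photos = [
    (530100.0, 180200.0, ["bigben", "westminster"]),
    (530450.0, 180499.0, ["thames", "westminster"]),
    (530600.0, 180200.0, ["londoneye"]),
]

cell_documents = build_cell_documents(photos)
for cell, tags in sorted(cell_documents.items()):
    print(cell, tags)
```

Each value in `cell_documents` could then be passed as one document to a topic modeling library of choice. Changing `cell_size` is one way to explore the sensitivity to grid resolution that the abstract mentions.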
