Geographic information from georeferenced social media data

The last few years have seen the emergence of a class of communication and information platforms that some call social awareness streams (SAS) [15]. Available from social media services such as Facebook, Twitter, FourSquare, and Flickr, these hugely popular platforms allow participants to post streams of lightweight content items, from short status messages to links, pictures, and videos, in a highly connected social environment. Many of these items are associated with location coordinates in the form of latitude and longitude, or with a business or venue that is in turn associated with a precise location. The number of such "geotagged" items is likely to grow with the number of people using geo-enabled devices to access and produce SAS data. The vast amounts of SAS data offer unique opportunities for understanding local communities and people's attitudes toward, attention to, and interest in them. Robust methods for learning about geographies and local communities from SAS data, drawing on techniques from Artificial Intelligence, Information Retrieval, and Natural Language Processing, can greatly improve the state of geographic information retrieval. Such contributions, ranging from better modelling of geographic areas to improved knowledge about these areas and how individuals and communities use them, have begun to surface in the last few years and are summarized in this article. The structure of this work borrows from Lynch [13], who identified five elements that make up an individual's perception of a city: districts, landmarks, paths, nodes, and edges. Adapting Lynch's framework to social media and SAS, this article proposes four elements that make up the geographic information that can be derived from social media about a city or area: districts, landmarks, paths, and activities.
This article presents a simple model for geographic SAS data and then considers the four social media elements, which correspond to the main types of applications of SAS data to geographic information systems: boundary definition and detection (districts); computation of attractions (landmarks); derivation and recommendation of paths; and evaluation of activities, interests, and temporal trends.
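
The article's own model for geographic SAS data is not reproduced in this abstract. As a rough illustration only, the two ways an item can carry a location (explicit latitude and longitude, or an attached venue that has a precise location) could be represented as in the following minimal sketch. The names `Venue`, `SASItem`, `location`, and `geotagged` are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Venue:
    """Hypothetical business or venue with a precise location."""
    name: str
    lat: float
    lon: float


@dataclass(frozen=True)
class SASItem:
    """Hypothetical lightweight content item posted to a stream."""
    user: str
    text: str
    lat: Optional[float] = None    # explicit coordinates, if any
    lon: Optional[float] = None
    venue: Optional[Venue] = None  # associated venue, if any

    def location(self) -> Optional[Tuple[float, float]]:
        """Return (lat, lon) from coordinates, else from the venue, else None."""
        if self.lat is not None and self.lon is not None:
            return (self.lat, self.lon)
        if self.venue is not None:
            return (self.venue.lat, self.venue.lon)
        return None


def geotagged(items: List[SASItem]) -> List[SASItem]:
    """Keep only the items that resolve to some location."""
    return [item for item in items if item.location() is not None]


# Illustrative, made-up items (not data from the article).
cafe = Venue(name="Example Cafe", lat=40.0, lon=-74.0)
items = [
    SASItem(user="a", text="photo from the park", lat=40.5, lon=-73.5),
    SASItem(user="b", text="checked in", venue=cafe),
    SASItem(user="c", text="no location here"),
]
located = geotagged(items)
```

In this sketch explicit coordinates take precedence over a venue's location; that ordering is an assumption made for the example, not something the abstract specifies.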

[1] Danah Boyd, et al. Tweeting from the Town Square: Measuring Geographic Local Networks, 2010, ICWSM.

[2] Ed H. Chi, et al. Tweets from Justin Bieber's heart: the dynamics of the location field in user profiles, 2011, CHI.

[3] Jon M. Kleinberg, et al. Mapping the world's photos, 2009, WWW '09.

[4] Mor Naaman, et al. World explorer: visualizing aggregate data from unstructured text in geo-referenced collections, 2007, JCDL '07.

[5] Mor Naaman, et al. Generating diverse and representative image search results for landmarks, 2008, WWW.

[6] Yutaka Matsuo, et al. Earthquake shakes Twitter users: real-time event detection by social sensors, 2010, WWW '10.

[7] Ross Purves, et al. Exploring place through user-generated content: Using Flickr tags to describe city cores, 2010, J. Spatial Inf. Sci.

[8] Darren Gergle, et al. On the "localness" of user-generated content, 2010, CSCW '10.

[9] Amit P. Sheth, et al. Spatio-Temporal-Thematic Analysis of Citizen Sensor Data: Challenges and Experiences, 2009, WISE.

[10] Martine De Cock, et al. Neighborhood restrictions in geographic IR, 2007, SIGIR.

[11] J. Jacobs. The Death and Life of Great American Cities, 1962.

[12] Mor Naaman, et al. Is it really about me?: message content in social awareness streams, 2010, CSCW '10.

[13] Hila Becker, et al. Hip and trendy: Characterizing emerging trends on Twitter, 2011, J. Assoc. Inf. Sci. Technol.

[14] Cong Yu, et al. Automatic construction of travel itineraries using social breadcrumbs, 2010, HT '10.

[15] Leysia Palen, et al. Microblogging during two natural hazards events: what twitter may contribute to situational awareness, 2010, CHI.

[16] Barry Wellman, et al. Geography of Twitter networks, 2012, Soc. Networks.

[17] B. Taper. Publications of the Joint Center for Urban Studies, 1970.

[18] Leysia Palen, et al. Chatter on the red: what hazards threat reveals about the social life of microblogged information, 2010, CSCW '10.

[19] Eric Gilbert, et al. The network in the garden: an empirical analysis of social media in rural life, 2008, CHI.

[20] Cecilia Mascolo, et al. Distance Matters: Geo-social Metrics for Online Social Networks, 2010, WOSN.

[21] Rein Ahas, et al. Mobile Positioning Data in Tourism Studies and Monitoring: Case Study in Tartu, Estonia, 2007, ENTER.

[22] Josep Blat, et al. Digital Footprinting: Uncovering Tourists with User-Generated Content, 2008, IEEE Pervasive Computing.

[23] Albert-László Barabási, et al. Understanding individual human mobility patterns, 2008, Nature.