Using Large Scale Aggregated Knowledge for Social Media Location Discovery

Geospatial analysis of location-enabled social media can yield vital insights in areas where situational awareness is important, such as disaster prevention and crisis response. However, several recent approaches are limited by the fact that only a small fraction of messages carry precise geotags or GPS coordinates of their origin. In this work, we introduce two strategies for assigning probable locations of origin to social media messages whose location is unknown. They are based on aggregated knowledge about the author, the textual content of the message, or both. Using our prototype implementation and a collected dataset comprising more than one year of geolocated Twitter data, we evaluate the effectiveness of our strategies. Our results show that we can locate up to 74% of all messages written in specific cities and about 20% of messages written in specific districts.
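The abstract does not spell out how either strategy works. As a purely illustrative sketch of what aggregating knowledge about an author could look like, the following assumes that each author's geolocated messages are reduced to city counts and that an untagged message is assigned the author's dominant city. The function names, the `min_share` threshold, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_author_profiles(geotagged):
    """Aggregate city counts per author from (author, city) pairs."""
    profiles = defaultdict(Counter)
    for author, city in geotagged:
        profiles[author][city] += 1
    return profiles


def guess_city(author, profiles, min_share=0.5):
    """Return the author's dominant city, or None if no city is dominant.

    min_share is a hypothetical threshold: the top city must account for
    at least this fraction of the author's geolocated messages.
    """
    counts = profiles.get(author)
    if not counts:
        return None  # no geolocated history for this author
    city, n = counts.most_common(1)[0]
    return city if n / sum(counts.values()) >= min_share else None


# Made-up history of geolocated messages: (author, city)
history = [
    ("alice", "Stuttgart"), ("alice", "Stuttgart"), ("alice", "Berlin"),
    ("bob", "Seattle"), ("bob", "Tacoma"), ("bob", "Portland"),
]
profiles = build_author_profiles(history)
```

With this toy history, `guess_city("alice", profiles)` yields a city because two of her three messages share one, while `guess_city("bob", profiles)` declines to guess because no single city reaches the threshold. Abstaining in that case trades coverage for precision, which is one plausible reason a method would locate only a fraction of all messages.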
