Fast and Fine-Grained Geolocalisation of Non-Geotagged Tweets

The rise in the use of social networks in recent years has made an abundance of information on different aspects of everyday social activities available online, with Twitter being the most prominent and timely source of such information. This has led to a proliferation of tools and applications that help end-users and large-scale event organizers better plan and manage their activities. When analysing information that originates from social networks, an important aspect is geolocalisation, i.e., determining the geographic coordinates of the relevant information, which several applications require (e.g., detecting trending venues or traffic jams). Unfortunately, only a very small percentage of Twitter posts are geotagged, which significantly restricts the applicability and utility of such applications. In this work, we address this problem by proposing a framework for geolocating tweets that are not geotagged. Our solution is general: it estimates the location from which a post was generated by exploiting the similarities in content between this post and a set of geotagged tweets, as well as their time-evolution characteristics. In contrast to previous approaches, our framework aims to provide accurate geolocation estimates at a fine grain (i.e., within a city). The experimental evaluation with real data demonstrates the efficiency and effectiveness of our approach.
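To make the content-similarity idea concrete, the following is a minimal illustrative sketch and not the framework proposed in this work. It assumes a uniform latitude/longitude grid, whitespace tokenisation, and cosine similarity over raw term counts; it ignores the time-evolution characteristics entirely. All names (`cell_of`, `build_cell_profiles`, `estimate_cell`) and the `cell_size` parameter are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenisation.
    return text.lower().split()


def cell_of(lat, lon, cell_size=0.01):
    # Assumption: a uniform grid in degrees; 0.01 degrees is roughly 1 km in latitude.
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def build_cell_profiles(geotagged, cell_size=0.01):
    # Aggregate the term counts of all geotagged tweets that fall in the same cell.
    profiles = defaultdict(Counter)
    for text, lat, lon in geotagged:
        profiles[cell_of(lat, lon, cell_size)].update(tokenize(text))
    return profiles


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_cell(text, profiles):
    # Return the grid cell whose aggregated content is most similar to the
    # non-geotagged tweet, or None when no cell shares any term with it.
    query = Counter(tokenize(text))
    best_cell, best_score = None, 0.0
    for cell, profile in profiles.items():
        score = cosine(query, profile)
        if score > best_score:
            best_cell, best_score = cell, score
    return best_cell


# Hypothetical toy data: (text, latitude, longitude).
geotagged = [
    ("concert at the arena tonight", 45.4642, 9.1900),
    ("traffic jam on the ring road", 45.5000, 9.2500),
]
profiles = build_cell_profiles(geotagged)
estimated = estimate_cell("huge traffic jam again", profiles)
```

In this toy setting the query shares terms only with the second geotagged tweet, so `estimated` is the grid cell containing that tweet. A realistic system would additionally need term weighting, a time window over the geotagged tweets, and a far larger reference set.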
