On Fine-Grained Geolocalisation of Tweets

Recently, the geolocalisation of tweets has become an important feature for a wide range of tasks in Information Retrieval and other domains, such as real-time event detection, topic detection, and disaster and emergency analysis. However, the number of relevant geo-tagged tweets available remains insufficient to perform such tasks reliably. Predicting the location of non-geo-tagged tweets is therefore an important yet challenging task: it can enlarge the sample of geo-tagged data and so benefit these downstream tasks. In this paper, we propose a location inference method that combines a ranking approach with a majority vote over tweets, where each tweet's vote is weighted by the credibility of its source (the Twitter user). Using geo-tagged tweets from two cities, Chicago and New York (USA), our experimental results demonstrate that our method statistically significantly outperforms our baselines in terms of accuracy and error distance in both cities, at the cost of a decrease in recall.
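The voting step described above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the function names, the `credibility` dictionary, the `default_weight` fallback and the `min_share` abstention threshold are all hypothetical, and the ranking stage is assumed to have already returned the top-ranked geo-tagged tweets as (user, location) pairs.

```python
from collections import defaultdict


def weighted_majority_vote(ranked, credibility, default_weight=1.0):
    """Aggregate location votes from top-ranked geo-tagged tweets.

    ranked: list of (user_id, location) pairs, assumed to come from a
        separate ranking step that is not shown here.
    credibility: dict mapping user_id to a non-negative weight
        (hypothetical; how credibility is estimated is not shown here).
    Returns (location, share_of_total_weight), or (None, 0.0) if no
    usable votes exist.
    """
    votes = defaultdict(float)
    for user_id, location in ranked:
        # Each tweet votes for its own location, weighted by its author.
        votes[location] += credibility.get(user_id, default_weight)
    total = sum(votes.values())
    if total <= 0:
        return None, 0.0
    best = max(votes, key=votes.get)
    return best, votes[best] / total


def predict(ranked, credibility, min_share=0.5):
    """Return the winning location, or None when the vote is too weak.

    Abstaining below min_share is an assumption made for this sketch; it
    shows one way a voting scheme can trade recall for accuracy.
    """
    location, share = weighted_majority_vote(ranked, credibility)
    return location if share > min_share else None


if __name__ == "__main__":
    ranked = [("u1", "cell_A"), ("u2", "cell_B"), ("u3", "cell_A"),
              ("u4", "cell_B"), ("u5", "cell_B")]
    credibility = {"u1": 0.9, "u2": 0.2, "u3": 0.8, "u4": 0.3, "u5": 0.1}
    print(predict(ranked, credibility))
```

In this toy input, `cell_B` has more raw votes (three against two), but the weighted vote favours `cell_A` because its two tweets come from users with higher credibility weights. When no location holds a clear share of the total weight, `predict` returns `None`, so the tweet is left unlocated rather than assigned a doubtful location.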
