Inferring crowd-sourced venues for tweets

Knowing the geo-located venue of a tweet can facilitate better understanding of a user's geographic context, allowing apps to present information, recommend services, and target advertisements more precisely. However, because of privacy concerns, few users enable geotagging, so only a small percentage of tweets are geotagged; furthermore, even when geo-coordinates are available, the venue closest to them may not be the correct one. In this paper, we present a method for providing a ranked list of geo-located venues for a non-geotagged tweet, which simultaneously indicates the venue name and the geo-location at a very fine granularity. In our proposed method, Venue Inference for Tweets (VIT), we construct a heterogeneous social network in order to analyze the embedded social relations, and leverage the available but limited geographic data to estimate the geo-located venue of tweets. A single classifier is trained to estimate the probability that a tweet and a geo-located venue are linked, rather than training a separate model for each venue. We examine the performance of four types of social relation features and three types of geographic features embedded in a social network when inferring whether a tweet and a venue are linked, with a best accuracy of over 88%. We use the classifier probability estimates to rank the candidate geo-located venues of a non-geotagged tweet from over 19k possibilities, and observe an average top-5 accuracy of 29%.
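
The ranking and top-k evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `rank_venues` and `top_k_accuracy`, the toy score table, and the stand-in `toy_link_prob` are all hypothetical, and the real link probability would come from the single classifier trained on the social relation and geographic features.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical signature: returns an estimated probability that a tweet
# and a venue are linked. In the paper this role is played by one trained
# classifier; here it is only a placeholder.
LinkProb = Callable[[str, str], float]


def rank_venues(tweet: str, venues: Sequence[str], link_prob: LinkProb) -> List[str]:
    """Order candidate venues by decreasing estimated link probability."""
    return sorted(venues, key=lambda v: link_prob(tweet, v), reverse=True)


def top_k_accuracy(
    tweets: Sequence[str],
    true_venue: Dict[str, str],
    venues: Sequence[str],
    link_prob: LinkProb,
    k: int = 5,
) -> float:
    """Fraction of tweets whose true venue appears among the k best-ranked venues."""
    if not tweets:
        return 0.0
    hits = sum(
        1 for t in tweets if true_venue[t] in rank_venues(t, venues, link_prob)[:k]
    )
    return hits / len(tweets)


# Toy data, invented for illustration only.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("t1", "cafe"): 0.9, ("t1", "gym"): 0.2, ("t1", "park"): 0.4,
    ("t2", "cafe"): 0.1, ("t2", "gym"): 0.3, ("t2", "park"): 0.8,
}
TOY_VENUES = ["cafe", "gym", "park"]
TOY_TRUTH = {"t1": "cafe", "t2": "gym"}


def toy_link_prob(tweet: str, venue: str) -> float:
    """Stand-in for the classifier's probability estimate (lookup table)."""
    return TOY_SCORES[(tweet, venue)]


if __name__ == "__main__":
    print(rank_venues("t1", TOY_VENUES, toy_link_prob))
    print(top_k_accuracy(["t1", "t2"], TOY_TRUTH, TOY_VENUES, toy_link_prob, k=1))
```

Because one scoring function is shared by every tweet-venue pair, adding a candidate venue only extends the list being sorted; no per-venue model has to be trained, which is the design choice the abstract highlights.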
