Home is where your friends are: Utilizing the social graph to locate Twitter users in a city

Micro-blogging services such as Twitter have gained enormous popularity over the last few years, leading to massive volumes of user-generated content. A portion of this content is shared via geo-aware mobile devices, such as smartphones. Information shared from such a device can be tagged with the user's location, depending on the user's settings. These geostamps enable a number of mainstream applications, such as emergency response, disease tracking, news reporting, and advertising. Unfortunately, informative geostamps are typically sparse, since content is often shared via devices that do not support geo-tagging, such as desktop or laptop computers. In addition, even when a mobile device is used, a flawed geo-location service can lead to missing geostamps, or to geostamps that are too general to be informative. In this work, we address this sparsity issue via a new approach that identifies users attached to a given location of interest, such as a city. We then focus on retrieving specific tweets at a finer granularity within the given location, such as specific blocks within a city. Our approach leverages the correlation between strong connectivity in the social graph and proximity in the real world, while utilizing both textual tweet content and Twitter's underlying social graph. Previous relevant work assumes that all required Twitter data is available without access restrictions. This is an unrealistic assumption, since Twitter limits the number of data requests per user and charges a subscription fee for unrestricted access. Therefore, to increase the number of practitioners and applications that can benefit from our work, we optimize our method to work with the minimum number of queries to the Twitter API. Finally, our experiments demonstrate the efficacy of our approach through both quantitative and qualitative evaluation.
