Hyperlocal Home Location Identification of Twitter Profiles

Knowledge of a user's location provides valuable information that can be used to build region-specific models and applications (e.g. models of the language used in a particular region, or map-based visualisations of social media posts). Determining a user's home location, however, remains a challenge. Current approaches make use of geo-located tweets or textual cues, but can often predict location only at a coarse level of granularity (e.g. city level), while many applications require finer-grained (hyperlocal) predictions. We present a novel approach to hyperlocal home location identification based on clustering of geo-located tweets. We also develop a gold-standard data set for home location identification by making use of indicative phrases in geo-located tweets. We find that the cluster-based approaches outperform current techniques for hyperlocal location prediction.
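
To make the idea of clustering geo-located tweets concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that a user's tweets are available as (latitude, longitude) pairs, links points that lie within a fixed radius of one another, and takes the centroid of the largest resulting cluster as the home estimate. The function names, the 0.5 km radius, and the "largest cluster wins" rule are all illustrative assumptions.

```python
import math
from collections import deque

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_points(points, radius_km):
    """Group points into connected components linked by distances <= radius_km.

    This is a simple single-linkage grouping chosen for illustration; it is an
    assumption, not the clustering algorithm used in the paper.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        members = [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if haversine_km(points[i], points[j]) <= radius_km]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        clusters.append([points[i] for i in members])
    return clusters


def estimate_home(points, radius_km=0.5):
    """Return the centroid of the largest cluster, or None if there are no points.

    The centroid is a plain mean of latitudes and longitudes, which is only
    reasonable for small clusters that do not straddle the antimeridian.
    """
    if not points:
        return None
    largest = max(cluster_points(points, radius_km), key=len)
    lat = sum(p[0] for p in largest) / len(largest)
    lon = sum(p[1] for p in largest) / len(largest)
    return (lat, lon)


# Hypothetical coordinates: three tweets close together, two far away.
tweets = [
    (53.3811, -1.4701),
    (53.3813, -1.4699),
    (53.3809, -1.4703),
    (53.4084, -2.9916),
    (51.5074, -0.1278),
]
home = estimate_home(tweets)
print(home)
```

A smaller radius gives a tighter, more hyperlocal estimate but needs more geo-located tweets near the home to form a cluster; a density-based algorithm that discards isolated points as noise would be a natural alternative to this grouping.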
