"I'm Eating a Sandwich in Hong Kong": Modeling Locations with Tweets

Social media such as Twitter generate large quantities of data about what a person is thinking and doing in a particular location. We leverage this data to build models of locations to improve our understanding of a user’s geographic context. Understanding the user’s geographic context in turn allows us to present information, recommend businesses and services, and place advertisements that are relevant at a hyper-local level. In this paper we create language models of locations using coordinates extracted from geotagged Twitter data. We model locations at varying levels of granularity, from the zip code level to the country level. We measure the accuracy of these models by the degree to which we can predict the location of an individual tweet, and further by the accuracy with which we can predict the location of a user. We find that we can match the performance of the industry-standard tool for predicting the location of both the tweet and the user at the country, state, and city levels, and far exceed its performance at the hyper-local level, achieving a three- to ten-fold increase in accuracy at the zip code level.
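To make the idea of a location language model concrete, the following is a minimal sketch, not the method evaluated in this paper. It assumes a unigram model per location with add-one smoothing and assigns a tweet to the location whose model gives it the highest log-likelihood. The function names (`tokenize`, `train`, `log_likelihood`, `predict`), the location labels, and the toy tweets are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (location label, tweet text). Not real data.
TOY_TWEETS = [
    ("hong_kong", "eating dim sum in hong kong"),
    ("hong_kong", "ferry across victoria harbour"),
    ("new_york", "bagel and coffee in brooklyn"),
    ("new_york", "subway delays in manhattan again"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def train(tweets):
    """Build one word-count table per location, plus the shared vocabulary."""
    counts = defaultdict(Counter)
    vocab = set()
    for location, text in tweets:
        tokens = tokenize(text)
        counts[location].update(tokens)
        vocab.update(tokens)
    return counts, vocab


def log_likelihood(tokens, word_counts, vocab_size):
    """Log-probability of the tokens under an add-one smoothed unigram model.

    The extra +1 in the denominator reserves probability mass for words
    that were never seen in training.
    """
    total = sum(word_counts.values())
    denom = total + vocab_size + 1
    return sum(math.log((word_counts[t] + 1) / denom) for t in tokens)


def predict(text, counts, vocab):
    """Return the location whose language model scores the text highest."""
    tokens = tokenize(text)
    return max(
        counts,
        key=lambda loc: log_likelihood(tokens, counts[loc], len(vocab)),
    )


counts, vocab = train(TOY_TWEETS)
guess = predict("dim sum near the harbour", counts, vocab)
```

In this sketch the location label stands in for any granularity (zip code, city, state, or country); a geotagged tweet's coordinates would first be mapped to such a label before training. A user-level prediction could, under the same assumptions, be obtained by scoring the concatenation of a user's tweets.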