High-Resolution Home Location Prediction from Tweets Using Deep Learning with Dynamic Structure

Timely, high-resolution estimates of the home locations of a sufficiently large subset of the population are critical for applications such as disaster response and public health. However, conventional data sources, such as censuses and surveys, have a substantial time lag and cannot capture seasonal trends. Recently, the large user base and real-time nature of social media data have been leveraged to address this problem. However, the inherent sparsity and noise of these data, along with large uncertainty in the estimated home locations, have limited their effectiveness. In this paper, we develop a deep-learning solution that handles the sparsity and noise of social media data. We obtain over 90% accuracy for large subsets on a commonly used dataset. Systematic comparisons show that our method gives the highest accuracy both for the entire sample and for subsets.
