Towards unsupervised home location inference from online social media

A user's home location is important information for many advanced information services in big data applications (e.g., localized recommendation, targeted advertising for local businesses, and urban planning). In this paper, we study the problem of accurately inferring people's home locations from the noisy and sparse data they voluntarily share on online social media. Previous studies have developed supervised learning approaches to predict a person's home location in a city. However, the accuracy of these techniques depends largely on a high-quality training dataset, which is difficult and expensive to obtain in practice. In this study, we propose a new analytical framework, Unsupervised Home Location Inference (UHLI), to accurately infer people's home locations using a set of principled approaches. In particular, the UHLI scheme addresses the critical challenges of using sparse and noisy online social media data and derives an optimal solution to the home location inference problem. We evaluated the performance of our scheme and compared it to state-of-the-art baselines using three real-world data traces collected from Foursquare. The results show that our scheme accurately infers people's home locations and significantly outperforms the state-of-the-art baselines.
