Towards social user profiling: unified and discriminative influence model for inferring home locations

Users' locations are important to many applications, such as targeted advertising and news recommendation. In this paper, we focus on the problem of profiling users' home locations in the context of a social network (Twitter). The problem is nontrivial because the signals that may help identify a user's location are scarce and noisy. We propose a unified discriminative influence model, named UDI, to solve the problem. To overcome the challenge of scarce signals, UDI integrates signals observed from both the social network (friends) and user-centric data (tweets) in a unified probabilistic framework. To overcome the challenge of noisy signals, UDI captures how likely a user is to connect to a signal with respect to 1) the distance between the user and the signal, and 2) the influence scope of the signal. Based on the model, we develop local and global location prediction methods. Experiments on a large-scale data set show that our methods improve on state-of-the-art methods by 13% and achieve the best performance.
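The two factors named above, distance and influence scope, can be illustrated with a minimal sketch. It is not the UDI model itself: it assumes, purely for illustration, that a signal's influence decays as an isotropic Gaussian of great-circle distance whose standard deviation is the signal's influence scope, and that a local prediction picks the candidate location with the highest total log-score. The names `haversine_km`, `log_connection_score`, and `predict_home`, and all coordinates and scope values, are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def log_connection_score(user_loc, signal_loc, scope_km):
    """Log of an assumed 2-D Gaussian influence density.

    A small scope (a local friend) penalizes distance sharply; a large scope
    (a celebrity account or a widely used place name) is nearly flat, so it
    says little about where the user lives.
    """
    d = haversine_km(user_loc, signal_loc)
    var = scope_km ** 2
    return -math.log(2 * math.pi * var) - (d * d) / (2 * var)


def predict_home(candidates, signals):
    """Return the candidate location that maximizes the summed log-score.

    `signals` is a list of (location, scope_km) pairs; friends and tweeted
    venues are treated alike here, which is a simplifying assumption.
    """
    return max(
        candidates,
        key=lambda c: sum(log_connection_score(c, loc, s) for loc, s in signals),
    )


# Hypothetical example: two local friends near Austin, one broad-scope account in New York.
austin = (30.27, -97.74)
new_york = (40.71, -74.01)
signals = [((30.30, -97.70), 50.0), ((30.25, -97.75), 50.0), (new_york, 2000.0)]
home = predict_home([austin, new_york], signals)
```

In this toy setting the two narrow-scope friends outweigh the broad-scope account, so the prediction stays near the friends; this mirrors how weighting by influence scope is meant to discount noisy signals.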
