Predicting Restaurant Consumption Level through Social Media Footprints

Accurate prediction of user attributes from social media is valuable for both social science analysis and consumer targeting. In this paper, we propose a systematic method that leverages users' online social media content to predict their offline restaurant consumption level. Using social login as a bridge, we construct a dataset of 8,844 users whose accounts are linked across Dianping (similar to Yelp) and Sina Weibo. We derive the consumption-level ground truth from users' self-reported spending. We build predictive models using both raw features and, in particular, latent features such as topic distributions and celebrity clusters. The results indicate that online social media content has strong predictive power for offline spending. Finally, through qualitative feature analysis, we present the differences in word usage, topic interests, and following behavior among consumption-level groups.
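As a rough illustration of the ground-truth step, the sketch below turns self-reported spending into discrete consumption levels. The abstract does not say how spending is binned, so the equal-frequency split, the choice of three levels, the function name `consumption_levels`, and the toy data are all assumptions made for this example rather than the paper's procedure.

```python
from statistics import mean

def consumption_levels(spend_by_user, n_levels=3):
    """Map each user to a level 0..n_levels-1 by rank of mean reported spending."""
    # Average the self-reported amounts per user; users with no reports are skipped.
    avg = {u: mean(v) for u, v in spend_by_user.items() if v}
    # Rank users from lowest to highest average spending.
    ranked = sorted(avg, key=avg.get)
    # Assumption: equal-frequency bins, i.e. roughly the same number of users per level.
    return {u: i * n_levels // len(ranked) for i, u in enumerate(ranked)}

# Hypothetical toy data: user id -> self-reported spending per visit.
reports = {
    "u1": [30, 50], "u2": [200, 260], "u3": [80], "u4": [],
    "u5": [120, 100], "u6": [15], "u7": [400],
}
levels = consumption_levels(reports)
```

Labels of this kind could then serve as the target for a classifier trained on raw or latent features. Rank-based bins keep the classes balanced, whereas fixed spending thresholds would be easier to interpret; which of the two the paper uses is not stated here.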
