Inferring Private Demographics of New Users in Recommender Systems

As wireless and mobile devices become part of daily life, more and more people interact with online services that use recommender systems to suggest movies, news, and points of interest. Private user demographics such as age and gender are valuable for applications including personalized advertising, social studies, and marketing. However, users do not always provide these details in their online profiles because of privacy concerns. Most existing approaches infer private attributes from a sufficiently long interaction history and can therefore fail for new users with few ratings. In this paper, we present a novel preference elicitation method in which a recommender system adaptively asks cold-start users to rate selected items and infers their demographics from only a few interactions. Specifically, latent user profiles are learned jointly across the tasks of demographic inference and rating prediction, which enables knowledge transfer between the two related tasks and improves prediction accuracy for both. The proposed method also helps clarify the tradeoff between user privacy and the utility of personalization. Experimental results on real-world datasets demonstrate the accuracy of the proposed method for both demographic inference and rating prediction.
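
To make the idea of shared latent user profiles concrete, the following is a minimal sketch, not the authors' method. It assumes a plain matrix factorization for ratings and a logistic model for one binary demographic attribute, both reading from the same user factors and trained jointly by full-batch gradient descent. The function names (fit_joint_model, infer_new_user), the hyperparameters, and the synthetic data are all hypothetical, and the adaptive selection of which items to ask about is not implemented here: the new user's rated items are simply given.

```python
import numpy as np

def fit_joint_model(R, mask, y, k=4, alpha=1.0, reg=0.05, lr=0.005, epochs=500, seed=0):
    """Jointly fit rating factors and a demographic classifier (illustrative only).

    R: (n_users, n_items) ratings; mask: 1.0 where a rating is observed;
    y: (n_users,) binary demographic labels in {0, 1}.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))   # shared latent user profiles
    V = 0.1 * rng.standard_normal((n_items, k))   # latent item factors
    w = np.zeros(k)                               # demographic classifier weights
    losses = []
    for _ in range(epochs):
        err = mask * (U @ V.T - R)                # residuals on observed ratings only
        p = 1.0 / (1.0 + np.exp(-(U @ w)))        # predicted P(y = 1) per user
        rating_loss = 0.5 * np.sum(err ** 2)
        demo_loss = -np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        penalty = 0.5 * reg * (np.sum(U ** 2) + np.sum(V ** 2) + np.sum(w ** 2))
        losses.append(rating_loss + alpha * demo_loss + penalty)
        # Both tasks contribute to the gradient of the shared user factors U.
        grad_U = err @ V + alpha * np.outer(p - y, w) + reg * U
        grad_V = err.T @ U + reg * V
        grad_w = alpha * (U.T @ (p - y)) + reg * w
        U -= lr * grad_U
        V -= lr * grad_V
        w -= lr * grad_w
    return U, V, w, losses

def infer_new_user(V, w, item_ids, ratings, reg=0.05):
    """Fold in a cold-start user from a few ratings, then score the attribute."""
    V_s = V[item_ids]                             # factors of the items the user rated
    A = V_s.T @ V_s + reg * np.eye(V.shape[1])
    u = np.linalg.solve(A, V_s.T @ np.asarray(ratings, dtype=float))  # ridge estimate
    return float(1.0 / (1.0 + np.exp(-(u @ w))))

# Synthetic data for illustration only (not from the paper).
rng = np.random.default_rng(1)
U_true = rng.standard_normal((20, 3))
V_true = rng.standard_normal((15, 3))
R = U_true @ V_true.T
mask = (rng.random((20, 15)) < 0.6).astype(float)
y = (U_true[:, 0] > 0).astype(float)

U, V, w, losses = fit_joint_model(R, mask, y)
p_new = infer_new_user(V, w, item_ids=[0, 3, 7], ratings=R[0, [0, 3, 7]])
```

Because the demographic loss and the rating loss both update the same matrix U, information from one task shapes the representation used by the other, which is the kind of knowledge transfer the abstract describes. The weight alpha is an assumed knob for balancing the two objectives.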
