Privacy-preserving topic model for tagging recommender systems

Tagging recommender systems provide users the freedom to explore tags and obtain recommendations. The releasing and sharing of these tagging datasets will accelerate both commercial and research work on recommender systems. However, releasing the original tagging datasets is usually confronted with serious privacy concerns, because adversaries may re-identify a user and her/his sensitive information from tagging datasets with only a little background information. Recently, several privacy techniques have been proposed to address the problem, but most of these lack a strict privacy notion, and rarely prevent individuals being re-identified from the dataset. This paper proposes a privacy- preserving tag release algorithm, PriTop. This algorithm is designed to satisfy differential privacy, a strict privacy notion with the goal of protecting users in a tagging dataset. The proposed PriTop algorithm includes three privacy-preserving operations: Private topic model generation structures the uncontrolled tags; private weight perturbation adds Laplace noise into the weights to hide the numbers of tags; while private tag selection finally finds the most suitable replacement tags for the original tags, so the exact tags can be hidden. We present extensive experimental results on four real-world datasets, Delicious, MovieLens, Last.fm and BibSonomy. While the recommendation algorithm is successful in all the cases, our results further suggest the proposed PriTop algorithm can successfully retain the utility of the datasets while preserving privacy.

[1]  Cynthia Dwork,et al.  Differential Privacy: A Survey of Results , 2008, TAMC.

[2]  Xinyuan Liang Reasoning Algorithm of Multi-Value Fuzzy Causality Diagram Based on Unitizing Coefficient , 2007 .

[3]  Andreas Hotho,et al.  Tag Recommendations in Folksonomies , 2007, LWA.

[4]  Jordi Forné,et al.  Measuring the privacy of user profiles in personalized information systems , 2014, Future Gener. Comput. Syst..

[5]  David M. Blei,et al.  Probabilistic topic models , 2012, Commun. ACM.

[6]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[7]  Philip S. Yu,et al.  Privacy-preserving data publishing: A survey of recent developments , 2010, CSUR.

[8]  Naren Ramakrishnan,et al.  Privacy Risks in Recommender Systems , 2001, IEEE Internet Comput..

[9]  Daniel A. Spielman,et al.  Spectral Graph Theory and its Applications , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[10]  Wenliang Du,et al.  Privacy-preserving collaborative filtering using randomized perturbation techniques , 2003, Third IEEE International Conference on Data Mining.

[11]  Wenliang Du,et al.  Achieving Private Recommendations Using Randomized Response Techniques , 2006, PAKDD.

[12]  Tsan-sheng Hsu,et al.  Privacy-Preserving Collaborative Recommender Systems , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[13]  Vitaly Shmatikov,et al.  Robust De-anonymization of Large Sparse Datasets , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[14]  Roelof van Zwol,et al.  Flickr tag recommendation based on collective knowledge , 2008, WWW.

[15]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[16]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[17]  Yanjiao Chen,et al.  SpringerBriefs in Electrical and Computer Engineering , 2015 .

[18]  Tianqing Zhu,et al.  Differential privacy for neighborhood-based Collaborative Filtering , 2013, 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013).

[19]  John F. Canny,et al.  Collaborative filtering with privacy via factor analysis , 2002, SIGIR '02.

[20]  Cynthia Dwork,et al.  Differential Privacy , 2006, ICALP.

[21]  Tsvi Kuflik,et al.  Enhancing privacy and preserving accuracy of a distributed collaborative filtering , 2007, RecSys '07.

[22]  Jordi Forné,et al.  Privacy-Preserving Enhanced Collaborative Tagging , 2014, IEEE Transactions on Knowledge and Data Engineering.

[23]  Jianhua Lin,et al.  Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.

[24]  Dingzhu Du,et al.  Proceedings of the 5th international conference on Theory and applications of models of computation , 2006 .

[25]  Vitaly Shmatikov,et al.  2011 IEEE Symposium on Security and Privacy “You Might Also Like:” Privacy Risks of Collaborative Filtering , 2022 .

[26]  Aaron Roth,et al.  A learning theory approach to noninteractive database privacy , 2011, JACM.

[27]  Ralf Krestel,et al.  Latent dirichlet allocation for tag recommendation , 2009, RecSys '09.

[28]  Bamshad Mobasher,et al.  Personalized recommendation in social tagging systems using hierarchical clustering , 2008, RecSys '08.

[29]  Mark Steyvers,et al.  Finding scientific topics , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[30]  Ilya Mironov,et al.  Differentially private recommender systems: building privacy into the net , 2009, KDD.

[31]  Vitaly Shmatikov,et al.  How To Break Anonymity of the Netflix Prize Dataset , 2006, ArXiv.

[32]  Douglas M. Blough,et al.  Privacy Preserving Collaborative Filtering Using Data Obfuscation , 2007, 2007 IEEE International Conference on Granular Computing (GRC 2007).

[33]  Panagiotis Symeonidis,et al.  Tag recommendations based on tensor dimensionality reduction , 2008, RecSys '08.