Privacy-Preserving Social Media Data Publishing for Personalized Ranking-Based Recommendation

Personalized recommendation is crucial to help users find pertinent information. It often relies on a large collection of user data, in particular users’ online activity (e.g., tagging, rating, or checking in) on social media, to mine user preferences. However, releasing such activity data makes users vulnerable to inference attacks, as private data (e.g., gender) can often be inferred from it. In this paper, we propose PrivRank, a customizable and continuous privacy-preserving social media data publishing framework that protects users against inference attacks while enabling personalized ranking-based recommendations. Its key idea is to continuously obfuscate user activity data such that the privacy leakage of user-specified private data is minimized under a given data distortion budget. This budget bounds the ranking loss incurred by the obfuscation process, which preserves the utility of the data for recommendation. An empirical evaluation on both synthetic and real-world datasets shows that our framework can efficiently provide effective and continuous protection of user-specified private data, while still preserving the utility of the obfuscated data for personalized ranking-based recommendation. Compared to state-of-the-art approaches, PrivRank achieves both better privacy protection and higher utility in all the ranking-based recommendation use cases we tested.
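The trade-off described above, between privacy leakage and ranking loss, can be illustrated with a small sketch. The abstract does not state which measures PrivRank uses, so the choices here are assumptions made for illustration only: leakage is measured as the mutual information between a private attribute and the released data, and ranking loss as the normalized Kendall tau distance between two rankings. All function names and the toy distribution are hypothetical.

```python
import math
from itertools import combinations


def released_joint(p_ax, channel):
    """Joint p(a, y) after obfuscation: sum over x of p(a, x) * channel[x][y].

    p_ax[a][x] is the joint probability of private attribute a and activity x;
    channel[x][y] is the probability of releasing y when the true activity is x.
    """
    n_y = len(channel[0])
    return [
        [sum(row[x] * channel[x][y] for x in range(len(row))) for y in range(n_y)]
        for row in p_ax
    ]


def mutual_information(joint):
    """Mutual information I(A; Y) in bits for a joint table joint[a][y]."""
    p_a = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (p_a[a] * p_y[y]))
    return mi


def kendall_tau_distance(rank_1, rank_2):
    """Fraction of item pairs that the two rankings order differently."""
    pos_1 = {item: i for i, item in enumerate(rank_1)}
    pos_2 = {item: i for i, item in enumerate(rank_2)}
    pairs = list(combinations(rank_1, 2))
    discordant = sum(
        1 for u, v in pairs if (pos_1[u] - pos_1[v]) * (pos_2[u] - pos_2[v]) < 0
    )
    return discordant / len(pairs)


# Toy example (assumed numbers): the private attribute correlates with activity.
p_ax = [[0.4, 0.1], [0.1, 0.4]]
identity = [[1.0, 0.0], [0.0, 1.0]]   # release the data unchanged
uniform = [[0.5, 0.5], [0.5, 0.5]]    # release pure noise

leak_raw = mutual_information(released_joint(p_ax, identity))
leak_noise = mutual_information(released_joint(p_ax, uniform))
loss = kendall_tau_distance(["a", "b", "c"], ["a", "c", "b"])
```

In this toy setting, releasing the data unchanged leaks a positive amount of information about the private attribute, while releasing pure noise leaks none but destroys the ranking. An obfuscation scheme in the spirit of the abstract would search between these extremes for the channel with the least leakage whose ranking loss stays within the budget.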
