k-CoRating: Filling Up Data to Obtain Privacy and Utility

Even when identifiers are removed from Collaborative Filtering (CF) recommendation datasets and the ratings are lightly perturbed before release, prior research indicates that an adversary can still re-identify individuals from a small amount of auxiliary information. In this paper, we propose k-coRating, a novel privacy-preserving model that protects data privacy by replacing some null ratings with "well-predicted" scores. These filled-in scores not only mask the original ratings so that a k-anonymity-like privacy guarantee is preserved, but also enhance data utility (measured by prediction accuracy in this paper). This shows that the traditional assumption that accuracy and privacy are conflicting goals is not necessarily correct. We show that finding the optimal k-coRated mapping is NP-hard and design a naive but efficient algorithm to achieve k-coRating. The claims are supported by experimental results.
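The idea of filling null ratings can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: that a rating matrix counts as k-coRated when every user shares exactly the same set of rated items with at least k-1 other users, that a per-item mean is an acceptable stand-in for a "well-predicted" score, and that grouping users by profile size is a reasonable heuristic. The names `item_means`, `fill_to_k_corated`, and `is_k_corated` are hypothetical, and this is not the algorithm proposed in the paper.

```python
from statistics import mean


def item_means(ratings):
    """Average observed rating per item (a stand-in for a real predictor)."""
    by_item = {}
    for user_ratings in ratings.values():
        for item, score in user_ratings.items():
            by_item.setdefault(item, []).append(score)
    return {item: mean(scores) for item, scores in by_item.items()}


def fill_to_k_corated(ratings, k):
    """Fill null ratings so each user shares an item set with >= k-1 others.

    `ratings` maps user -> {item: score}. Assumed heuristic: sort users by
    profile size, cut them into groups of at least k, and give every member
    of a group a rating for each item rated by anyone in that group.
    """
    if k < 1 or len(ratings) < k:
        raise ValueError("need at least k users and k >= 1")
    predicted = item_means(ratings)
    # Users with similar profile sizes end up in the same group.
    users = sorted(ratings, key=lambda u: (len(ratings[u]), u))
    groups = [users[i:i + k] for i in range(0, len(users), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        # Merge a short trailing group into the previous one.
        tail = groups.pop()
        groups[-1].extend(tail)
    filled = {}
    for group in groups:
        shared_items = set().union(*(ratings[u].keys() for u in group))
        for user in group:
            # Original ratings are kept; only null entries are filled.
            filled[user] = {
                item: ratings[user].get(item, predicted[item])
                for item in shared_items
            }
    return filled


def is_k_corated(ratings, k):
    """Check the assumed definition: each rated-item set occurs >= k times."""
    counts = {}
    for user_ratings in ratings.values():
        key = frozenset(user_ratings)
        counts[key] = counts.get(key, 0) + 1
    return all(count >= k for count in counts.values())


if __name__ == "__main__":
    sample = {
        "a": {"i1": 5, "i2": 3},
        "b": {"i2": 4},
        "c": {"i3": 2},
        "d": {"i1": 1, "i3": 4},
        "e": {"i2": 2},
    }
    filled = fill_to_k_corated(sample, k=2)
    print(is_k_corated(sample, 2), is_k_corated(filled, 2))
```

Under these assumptions, an adversary who knows which items a target rated can no longer single out one row by its item set alone, since at least k rows share it. The sketch makes no attempt to minimize the number of filled entries, which is the part the abstract identifies as NP-hard.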
