A Practical System for Privacy-Preserving Collaborative Filtering

Collaborative filtering is widely used in online services to improve the accuracy of recommender systems. The technique, however, requires users to reveal their preferences, which has undesirable privacy implications. We propose a collaborative filtering system that does not observe users' data yet can still provide useful recommendations. Compared to prior systems, our emphasis is on building a practical system that can reasonably be used by a large number of users. Our approach involves creating a primitive that clusters similar users privately by modifying existing methods such as Locality Sensitive Hashing. We also use artificial ratings as part of the process of privately predicting the rating for an item within a particular cluster. We evaluate our scheme on the Netflix Prize dataset and report the accuracy of our recommendations as a function of the privacy provided.
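
To make the clustering idea concrete, the following is a minimal sketch of standard random-hyperplane Locality Sensitive Hashing, not the modified private primitive described in the abstract. It assumes each user is represented by a dense, mean-centered rating vector; the function names (`make_hyperplanes`, `lsh_signature`, `cluster_users`) and the toy data are hypothetical and chosen only for illustration. Users whose vectors point in similar directions tend to receive the same bit signature and so fall into the same bucket.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(ratings, hyperplanes):
    """Return one bit per hyperplane: which side the rating vector falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(r * h for r, h in zip(ratings, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def cluster_users(user_ratings, hyperplanes):
    """Group users by signature; each distinct signature is one cluster."""
    clusters = {}
    for user, ratings in user_ratings.items():
        signature = lsh_signature(ratings, hyperplanes)
        clusters.setdefault(signature, []).append(user)
    return clusters


# Toy data (assumed): mean-centered ratings over three items.
planes = make_hyperplanes(num_bits=4, dim=3)
users = {
    "alice": [1.0, -1.0, 0.5],
    "bob": [2.0, -2.0, 1.0],    # same direction as alice, larger magnitude
    "carol": [-1.0, 1.0, -0.5],
}
clusters = cluster_users(users, planes)
# alice and bob share a bucket: the signature depends only on direction.
```

In this plain form the signature is computed directly from the raw ratings, so whoever computes or receives it learns something about the user's preferences. The sketch therefore shows only the similarity-grouping step and provides no privacy guarantee by itself.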
