Recommendation with k-Anonymized Ratings

Recommender systems are widely used to predict users' personalized preferences for goods or services from their past activities, such as item ratings or purchase histories. If collections of such personal activities were made publicly available, they could be used to personalize a diverse range of services, including targeted advertising and recommendations. However, public release carries a risk of privacy violations. The pioneering work of Narayanan et al. demonstrated that even when identifiers are removed, publicly released user ratings allow users to be re-identified by anyone who knows only a small number of their past ratings. In this paper, we assume the following setting: a collector gathers user ratings, anonymizes them, and distributes them; a recommender then builds a recommender system from the anonymized ratings provided by the collector. Under this setting, we exhaustively list the models of recommender systems that use anonymized ratings. For each model, we then present an item-based collaborative filtering algorithm for making recommendations from anonymized ratings. Our experimental results show that item-based collaborative filtering on anonymized ratings can perform better than collaborative filtering on 5 to 10 non-anonymized ratings. This surprising result indicates that, in some settings, privacy protection does not necessarily reduce the usefulness of recommendations. Analyzing this counterintuitive result experimentally, we observed that anonymization can reduce the sparsity of the ratings and that the variance of the prediction can be reduced if $k$, the anonymization parameter, is appropriately tuned. Together, these effects can improve the predictive performance of recommendations based on anonymized ratings in some settings.
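To make the collector/recommender setting concrete, the following is a minimal sketch, not the algorithm presented in the paper. It assumes that the collector groups users into blocks of at least $k$ and publishes each block's per-item mean rating in place of the individual rows, and that the recommender applies plain cosine-similarity item-based collaborative filtering to whatever it receives. The function names `anonymize`, `item_similarity`, and `predict` are hypothetical, and grouping users by row order is only a placeholder for a real clustering-based k-anonymization step.

```python
from math import sqrt


def anonymize(ratings, k):
    """Collector side (illustrative assumption, not the paper's method).

    Group rows into blocks of at least k users and replace every row by its
    block's per-item mean over observed ratings. None marks an unrated item.
    """
    n = len(ratings)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k users")
    n_items = len(ratings[0])
    # Naive grouping by row order; the last block absorbs the remainder.
    starts = list(range(0, n - n % k, k))
    blocks = [list(range(s, s + k)) for s in starts]
    blocks[-1].extend(range(starts[-1] + k, n))
    out = [None] * n
    for block in blocks:
        centroid = []
        for j in range(n_items):
            seen = [ratings[u][j] for u in block if ratings[u][j] is not None]
            centroid.append(sum(seen) / len(seen) if seen else None)
        for u in block:
            out[u] = list(centroid)  # every user in the block gets the same row
    return out


def item_similarity(ratings, a, b):
    """Cosine similarity between items a and b over users who rated both."""
    pairs = [(r[a], r[b]) for r in ratings
             if r[a] is not None and r[b] is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = sqrt(sum(x * x for x, _ in pairs))
    norm_b = sqrt(sum(y * y for _, y in pairs))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, user, item):
    """Recommender side: similarity-weighted mean of the user's other ratings."""
    num = den = 0.0
    for j, r in enumerate(ratings[user]):
        if j == item or r is None:
            continue
        s = item_similarity(ratings, item, j)
        num += s * r
        den += abs(s)
    return num / den if den else None


# Toy data: 4 users x 3 items, None = unrated.
R = [
    [5, 4, None],
    [None, 4, 2],
    [1, None, 5],
    [2, 1, None],
]
anon = anonymize(R, k=2)
prediction = predict(anon, user=0, item=2)
```

In this toy example the anonymized matrix has no missing entries even though the original has four, which illustrates the sparsity reduction the abstract refers to. Averaging over larger blocks also smooths individual ratings, so the choice of $k$ trades detail against variance.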

[1] Wenliang Du, et al. Privacy-preserving collaborative filtering using randomized perturbation techniques, 2003, Third IEEE International Conference on Data Mining.

[2] Samir Khuller, et al. Achieving anonymity via clustering, 2006, PODS '06.

[3] John Riedl, et al. Item-based collaborative filtering recommendation algorithms, 2001, WWW '01.

[4] David Rebollo Monedero, et al. A privacy-protecting architecture for recommendation systems via the suppression of ratings, 2012.

[5] Pierangela Samarati, et al. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, 1998.

[6] Latanya Sweeney, et al. k-Anonymity: A Model for Protecting Privacy, 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst.

[7] Jun-Lin Lin, et al. An efficient clustering method for k-anonymization, 2008, PAIS '08.

[8] John F. Canny, et al. Collaborative filtering with privacy, 2002, Proceedings 2002 IEEE Symposium on Security and Privacy.

[9] Arkadiusz Paterek, et al. Improving regularized singular value decomposition for collaborative filtering, 2007.

[10] John Riedl, et al. An algorithmic framework for performing collaborative filtering, 1999, SIGIR '99.

[11] L. C. Smith. Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques, 2013.

[12] Wendy Hui Wang, et al. Towards publishing recommendation data with predictive anonymization, 2010, ASIACCS '10.

[13] Douglas M. Blough, et al. Privacy Preserving Collaborative Filtering Using Data Obfuscation, 2007, 2007 IEEE International Conference on Granular Computing (GRC 2007).

[14] John Riedl, et al. GroupLens: an open architecture for collaborative filtering of netnews, 1994, CSCW '94.

[15] Vitaly Shmatikov, et al. Robust De-anonymization of Large Sparse Datasets, 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).