Matrix Factorization without User Data Retention

Recommender systems often rely on a centralized storage of user data and there is a growing concern about the misuse of the data. As recent studies have demonstrated, sensitive information could be inferred from user ratings stored by recommender systems. This work presents a novel semi-decentralized variant of the widely-used Matrix Factorization (MF) technique. The proposed approach eliminates the need for retaining user data, such that neither user ratings nor latent user vectors are stored by the recommender. Experimental evaluation indicates that the performance of the proposed approach is close to that of standard MF, and that the gap between the two diminishes as more user data becomes available. Our work paves the way to a new type of MF recommenders, which avoid user data retention, yet are capable of achieving accuracy similar to that of the state-of-the-art recommenders.

[1]  Tsvi Kuflik,et al.  Enhancing privacy and preserving accuracy of a distributed collaborative filtering , 2007, RecSys '07.

[2]  Daniel Lemire,et al.  Slope One Predictors for Online Rating-Based Collaborative Filtering , 2007, SDM.

[3]  Qingsheng Zhu,et al.  Incremental Collaborative Filtering recommender based on Regularized Matrix Factorization , 2012, Knowl. Based Syst..

[4]  Ilya Mironov,et al.  Differentially private recommender systems: building privacy into the net , 2009, KDD.

[5]  Vitaly Shmatikov,et al.  2011 IEEE Symposium on Security and Privacy “You Might Also Like:” Privacy Risks of Collaborative Filtering , 2022 .

[6]  Lars Schmidt-Thieme,et al.  Online-updating regularized kernel matrix factorization models for large-scale recommender systems , 2008, RecSys '08.

[7]  John Riedl,et al.  Do You Trust Your Recommendations? An Exploration of Security and Privacy Issues in Recommender Systems , 2006, ETRICS.

[8]  Roberto Turrin,et al.  Performance of recommender algorithms on top-n recommendation tasks , 2010, RecSys '10.

[9]  Vitaly Shmatikov,et al.  Robust De-anonymization of Large Sparse Datasets , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[10]  Saikat Guha,et al.  Serving Ads from localhost for Performance, Privacy, and Profit , 2009, HotNets.

[11]  Stratis Ioannidis,et al.  BlurMe: inferring and obfuscating user gender based on ratings , 2012, RecSys.

[12]  Murphy J. Stephen,et al.  You Might Also Like , 2014 .

[13]  Yehuda Koren,et al.  Matrix Factorization Techniques for Recommender Systems , 2009, Computer.

[14]  Iván Cantador,et al.  Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols , 2013, User Modeling and User-Adapted Interaction.

[15]  T. Graepel,et al.  Private traits and attributes are predictable from digital records of human behavior , 2013, Proceedings of the National Academy of Sciences.

[16]  Guy Shani,et al.  Evaluating Recommendation Systems , 2011, Recommender Systems Handbook.

[17]  Mohamed Ali Kâafar,et al.  You are what you like! Information leakage through users' Interests , 2012, NDSS.

[18]  Domonkos Tikk,et al.  Scalable Collaborative Filtering Approaches for Large Recommender Systems , 2009, J. Mach. Learn. Res..

[19]  Helen Nissenbaum,et al.  Adnostic: Privacy Preserving Targeted Advertising , 2010, NDSS.

[20]  Günter Müller Emerging Trends in Information and Communication Security , 2006, Lecture Notes in Computer Science.