Data Poisoning Attacks on Factorization-Based Collaborative Filtering

Recommendation and collaborative filtering systems are central to modern information and e-commerce applications. As these systems become increasingly popular in industry, their outputs can affect business decision making, which gives an adversarial party an incentive to compromise their availability or integrity. We introduce a data poisoning attack on collaborative filtering systems. We demonstrate how a powerful attacker with full knowledge of the learner can generate malicious data that maximizes the attacker's objectives while mimicking normal user behavior to avoid detection. Although the full-knowledge assumption is extreme, it enables a robust assessment of the vulnerability of collaborative filtering schemes to highly motivated attacks. We present efficient solutions for two popular factorization-based collaborative filtering algorithms: the alternating minimization formulation and the nuclear norm minimization method. Finally, we test the effectiveness of our proposed algorithms on real-world data and discuss potential defensive strategies.
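For readers unfamiliar with the first of the two learners named above, the following is a minimal sketch of alternating minimization for low-rank matrix completion, written as alternating ridge regressions over user and item factors. It is an illustration under stated assumptions, not the paper's algorithm or its attack: the function names `als_complete` and `objective`, the regularization weight `lam`, the rank, and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np

def als_complete(M, mask, rank=2, lam=0.1, iters=50, seed=0):
    """Fit M ~ U @ V.T on observed entries (mask == True).

    Hypothetical helper for illustration: alternates exact ridge
    solves for each user row of U and each item row of V.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.normal(scale=0.1, size=(m, rank))
    V = rng.normal(scale=0.1, size=(n, rank))
    reg = lam * np.eye(rank)  # ridge term keeps every solve well-posed
    for _ in range(iters):
        # Fix V, solve a small ridge regression for each user.
        for i in range(m):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + reg, Vo.T @ M[i, obs])
        # Fix U, solve a small ridge regression for each item.
        for j in range(n):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + reg, Uo.T @ M[obs, j])
    return U, V

def objective(M, mask, U, V, lam):
    """Squared error on observed entries plus Frobenius regularization."""
    resid = (M - U @ V.T)[mask]
    return np.sum(resid ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))
```

In this setting, poisoning would correspond to appending attacker-chosen rows to `M` and `mask` before fitting, so that the learned factors, and hence the predictions `U @ V.T`, shift toward the attacker's goal. How those rows are chosen is the subject of the paper and is not reproduced here.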
