Influence Function based Data Poisoning Attacks to Top-N Recommender Systems

Recommender systems are an essential component of web services for engaging users. Popular recommender systems model user preferences and item properties from large amounts of crowdsourced user-item interaction data, e.g., rating scores; the top-N items that best match a user's preferences are then recommended to that user. In this work, we show that an attacker can launch a data poisoning attack against a recommender system to make it produce recommendations the attacker desires, by injecting fake users with carefully crafted user-item interaction data. Specifically, an attacker can trick a recommender system into recommending a target item to as many normal users as possible. We focus on matrix-factorization-based recommender systems because they are widely deployed in industry. Given the number of fake users the attacker can inject, we formulate the crafting of the fake users' rating scores as an optimization problem. However, this optimization problem is challenging to solve because it is a non-convex integer program. To address the challenge, we develop several techniques to solve it approximately. For instance, we leverage the influence function to select a subset of normal users who are influential to the recommendations, and solve the optimization problem based on these influential users. Our results show that our attacks are effective and outperform existing methods.
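To make the attacked setting concrete, the sketch below shows a minimal matrix-factorization recommender of the kind the attack targets: latent user and item factors are fit to the observed ratings, and each user is then recommended the N unrated items with the highest predicted scores. This is a generic illustration under assumed hyperparameters, not the system or training procedure evaluated in the paper; the names factorize and top_n are hypothetical.

```python
# Minimal sketch of matrix-factorization based top-N recommendation (illustrative only).
import numpy as np

def factorize(R, mask, k=8, lr=0.01, reg=0.1, epochs=200, seed=0):
    """R: (m, n) rating matrix; mask: 1 where a rating is observed, 0 otherwise."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))
    V = 0.1 * rng.standard_normal((n, k))
    for _ in range(epochs):
        E = mask * (R - U @ V.T)       # prediction error on observed entries only
        U += lr * (E @ V - reg * U)    # gradient step on user factors
        V += lr * (E.T @ U - reg * V)  # gradient step on item factors
    return U, V

def top_n(U, V, mask, n_rec=5):
    """For each user, return indices of the n_rec highest-scoring unrated items."""
    scores = U @ V.T
    scores[mask == 1] = -np.inf        # never re-recommend already-rated items
    return np.argsort(-scores, axis=1)[:, :n_rec]

# Toy usage: 6 users, 10 items, sparse ratings in {1..5}.
rng = np.random.default_rng(1)
mask = (rng.random((6, 10)) < 0.4).astype(float)
R = mask * rng.integers(1, 6, size=(6, 10))
U, V = factorize(R, mask)
print(top_n(U, V, mask, n_rec=3))
```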

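The abstract's key scalability idea is to use influence functions to restrict the attack optimization to a small set of influential normal users. As a hedged illustration of the general influence-function recipe (in the spirit of Koh and Liang's "Understanding Black-box Predictions via Influence Functions"), the sketch below estimates, for a ridge-regression surrogate with a closed-form Hessian, how much up-weighting each training point would change a test loss, and then keeps the top-k most influential points. It is not the paper's exact derivation for matrix factorization; the functions ridge_fit and influences and all parameters are assumptions made for the example.

```python
# Illustrative influence-function selection on a ridge-regression surrogate (not the paper's MF derivation).
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form minimizer of (1/n)||X theta - y||^2 + lam ||theta||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def influences(X, y, x_test, y_test, lam):
    """Influence of up-weighting each training point on the test squared error."""
    n, d = X.shape
    theta = ridge_fit(X, y, lam)
    H = 2.0 * (X.T @ X / n + lam * np.eye(d))             # Hessian of the training objective
    grad_test = 2.0 * (x_test @ theta - y_test) * x_test  # gradient of the test loss w.r.t. theta
    grad_train = 2.0 * (X @ theta - y)[:, None] * X       # per-example training-loss gradients
    # I(z_i) = -grad_test^T H^{-1} grad_train_i ; large |I| marks influential points.
    return -(grad_train @ np.linalg.solve(H, grad_test))

# Toy usage: rank training points by |influence| and keep the top-k "influential" ones.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
theta_true = rng.standard_normal(5)
y = X @ theta_true + 0.1 * rng.standard_normal(50)
x_test, y_test = rng.standard_normal(5), 0.0
infl = influences(X, y, x_test, y_test, lam=0.1)
top_k = np.argsort(-np.abs(infl))[:10]
print("most influential training indices:", top_k)
```

In the paper's setting, the analogous step would score normal users by their influence on the recommendations and restrict the fake-user optimization to the highest-scoring subset.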