Poisoning Attacks with Generative Adversarial Nets

Machine learning algorithms are vulnerable to poisoning attacks: an adversary can inject malicious points into the training dataset to influence the learning process and degrade its performance. Optimal poisoning attacks, which model the attack as a bi-level optimisation problem, have already been proposed to evaluate worst-case scenarios, but solving these problems is computationally demanding and has limited applicability to some models, such as deep networks. In this paper we introduce a novel generative model to craft systematic poisoning attacks against machine learning classifiers by generating adversarial training examples, i.e. samples that look like genuine data points but degrade the classifier's accuracy when used for training. We propose a Generative Adversarial Net with three components: a generator, a discriminator, and the target classifier. This approach allows us to naturally model the detectability constraints that can be expected in realistic attacks and to identify the regions of the underlying data distribution that are most vulnerable to data poisoning. Our experimental evaluation shows the effectiveness of our attack at compromising machine learning classifiers, including deep networks.
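The sketch below illustrates, in PyTorch, one plausible reading of the three-component architecture described above: a generator producing poisoning points, a discriminator enforcing that those points look genuine (the detectability constraint), and the target classifier trained on the mixture of genuine and generated data. The abstract does not specify the loss functions, network sizes, labelling strategy, or the trade-off weight `alpha`; all of those are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a generator / discriminator / target-classifier loop
# for crafting poisoning points. Loss terms and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, feat_dim, n_classes = 16, 32, 2
alpha = 0.5  # assumed trade-off between detectability and poisoning effect

generator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
discriminator = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))
classifier = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-3)

def train_step(x_real, y_real, batch=64):
    """One joint update of discriminator, target classifier, and generator."""
    z = torch.randn(batch, latent_dim)
    x_fake = generator(z)
    y_fake = torch.randint(0, n_classes, (batch,))  # assumed: attacker assigns poison labels

    # Discriminator: separate genuine points from generated poisoning points,
    # which forces the generator to produce samples that look like real data.
    d_loss = (F.binary_cross_entropy_with_logits(discriminator(x_real),
                                                  torch.ones(len(x_real), 1))
              + F.binary_cross_entropy_with_logits(discriminator(x_fake.detach()),
                                                    torch.zeros(batch, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Target classifier: trained on genuine data mixed with the poisoning points,
    # so the generated samples directly influence its decision boundary.
    x_mix = torch.cat([x_real, x_fake.detach()])
    y_mix = torch.cat([y_real, y_fake])
    c_loss = F.cross_entropy(classifier(x_mix), y_mix)
    opt_c.zero_grad(); c_loss.backward(); opt_c.step()

    # Generator: look genuine to the discriminator while producing points the
    # current classifier handles poorly (sign and weighting are assumptions).
    x_fake = generator(z)
    g_detect = F.binary_cross_entropy_with_logits(discriminator(x_fake),
                                                   torch.ones(batch, 1))
    g_poison = -F.cross_entropy(classifier(x_fake), y_fake)
    g_loss = alpha * g_detect + (1 - alpha) * g_poison
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), c_loss.item(), g_loss.item()
```

The weight `alpha` is one simple way to express the trade-off the abstract hints at: larger values favour poisoning points that are hard to detect, smaller values favour points that more aggressively degrade the classifier.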
