Attack Strength vs. Detectability Dilemma in Adversarial Machine Learning

As the prevalence and everyday use of machine learning algorithms, and our reliance on them, grow dramatically, so do efforts to attack and undermine these algorithms with malicious intent, fueling a growing interest in adversarial machine learning. A number of approaches have been developed that can render a machine learning algorithm ineffective through poisoning or other types of attacks. Most attack algorithms use sophisticated optimization techniques whose objective function is designed to inflict maximum damage on the accuracy or performance of the algorithm with respect to some task. In this effort, we show that while such an objective function is indeed brutally effective at causing maximum damage on an embedded feature selection task, it often yields attack points that can be easily detected with an embarrassingly simple novelty or outlier detection algorithm. We then propose an equally simple yet elegant solution: adding a regularization term to the attacker's objective function that penalizes outlying attack points.
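The detectability side of this dilemma can be illustrated with a minimal, hypothetical sketch (ours, not the paper's experimental setup): poisoning points optimized purely for damage tend to drift far from the clean data distribution, so an off-the-shelf outlier detector such as scikit-learn's LocalOutlierFactor flags them easily. The synthetic data, detector choice, and parameters below are illustrative assumptions.

```python
# Hedged sketch: flagging candidate poisoning points with an off-the-shelf
# outlier detector. The data and detector settings are illustrative
# assumptions, not the paper's actual attack or evaluation setup.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=(200, 5))   # benign training points
poison = rng.normal(loc=6.0, scale=1.0, size=(10, 5))    # unconstrained attack points
X = np.vstack([clean, poison])

# LOF scores each sample by local density and labels inliers (+1) / outliers (-1).
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X)

flagged = np.where(labels == -1)[0]
print(f"{np.sum(flagged >= len(clean))} of {len(poison)} attack points flagged")
```

Schematically, the proposed remedy would trade some attack strength for stealth by maximizing an objective of the general form A(x) - lambda * Omega(x), where A measures the damage caused by attack point x and Omega penalizes how outlying x is relative to the clean data (notation ours); a regularized attacker keeps its points inside the clean density and evades exactly this kind of filter.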
