Mitigation of Adversarial Attacks through Embedded Feature Selection

Machine learning has become one of the main components for task automation in many application domains. Despite its impressive achievements, it has been shown that learning algorithms can be compromised by attackers at both training and test time. Machine learning systems are especially vulnerable to adversarial examples, where small perturbations added to the original data points produce incorrect or unexpected outputs from the learning algorithm at test time. Mitigating these attacks is hard, as adversarial examples are difficult to detect. Existing work claims that the security of machine learning systems against adversarial examples is weakened when feature selection is applied to reduce the systems' complexity. In this paper we empirically disprove this idea, showing that the relative distortion the attacker has to introduce to succeed in the attack is greater when the target system uses a reduced set of features. We also show that minimal adversarial examples differ more strongly, in a statistical sense, from genuine examples when fewer features are used. However, reducing the number of features can degrade the system's performance, and we illustrate this trade-off between security and accuracy with specific examples. Finally, we propose a design methodology for evaluating the security of machine learning classifiers with embedded feature selection against adversarial examples crafted with different attack strategies.
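
As a rough sketch of the kind of evaluation described above (not the authors' actual experimental code), the example below trains an L1-regularized linear classifier as a form of embedded (Lasso-style) feature selection, crafts a simple evasion example against it, and reports the distortion normalized by the number of retained features. The dataset, the hyperparameters, and the minimal_evasion helper are illustrative assumptions, not settings from the paper.

```python
# Illustrative sketch: embedded feature selection via L1-regularized logistic
# regression, followed by a simple minimal-distortion evasion attack.
# Hyperparameters (C, step size, iteration count) are assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)          # keep features in [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Embedded feature selection: the L1 penalty drives irrelevant weights to zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X_tr, y_tr)
selected = np.flatnonzero(clf.coef_.ravel())
print(f"selected {selected.size} of {X.shape[1]} features")

def minimal_evasion(x, target_label, step=0.01, max_iter=1000):
    """Toy attack on the linear model: move x along the weight vector in small
    steps until the predicted label flips, and return the perturbed point."""
    w = clf.coef_.ravel()
    direction = np.sign(w) if target_label == 1 else -np.sign(w)
    x_adv = x.copy()
    for _ in range(max_iter):
        if clf.predict(x_adv.reshape(1, -1))[0] == target_label:
            break
        x_adv = np.clip(x_adv + step * direction, 0.0, 1.0)
    return x_adv

# Craft an adversarial example from one test point and report the distortion
# relative to the number of features the model actually uses.
x0 = X_te[0]
x_adv = minimal_evasion(x0, target_label=1 - clf.predict(x0.reshape(1, -1))[0])
distortion = np.linalg.norm(x_adv - x0, ord=2)
print(f"L2 distortion: {distortion:.3f}, "
      f"per selected feature: {distortion / max(selected.size, 1):.4f}")
```

Repeating this measurement while sweeping the regularization strength C (and hence the number of selected features), and comparing the resulting distortions and test accuracies, would trace out the security versus accuracy trade-off discussed in the abstract.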
