FADER: Fast Adversarial Example Rejection

Deep neural networks are vulnerable to adversarial examples, i.e., carefully crafted inputs that mislead classification at test time. Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from legitimate training samples at different layer representations, a behavior typically exhibited by adversarial attacks. Despite technical differences, all these methods share a common backbone structure that we formalize and highlight in this contribution, as it can help in identifying promising research directions and drawbacks of existing methods. The first main contribution of this work is a review of these detection methods in the form of a unifying framework designed to accommodate both existing defenses and newer ones to come. As for drawbacks, the aforementioned defenses require comparing input samples against a large number of reference prototypes, possibly at different representation layers, dramatically worsening test-time efficiency. Moreover, such defenses are typically built by ensembling classifiers with heuristic methods, rather than optimizing the whole architecture end to end to better perform detection. As the second main contribution of this work, we introduce FADER, a novel technique for speeding up detection-based methods. FADER overcomes the issues above by employing RBF networks as detectors: by fixing the number of required prototypes, the runtime complexity of adversarial example detectors can be controlled. Our experiments show up to a 73x reduction in prototypes compared to the analyzed detectors on the MNIST dataset, and up to a 50x reduction on CIFAR10, without sacrificing classification accuracy on either clean or adversarial data.
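
To make the core idea concrete, the sketch below illustrates an RBF-based detector head with a fixed number of prototypes, so that detection cost depends on the prototype count rather than on the training-set size. This is a minimal illustration assuming PyTorch and illustrative hyperparameters (number of prototypes, RBF width gamma, rejection threshold); it is not FADER's exact architecture.

```python
# Minimal sketch (not the authors' exact architecture) of an RBF detector head
# with a fixed set of learnable prototypes. Hyperparameters are illustrative.
import torch
import torch.nn as nn

class RBFDetectorHead(nn.Module):
    def __init__(self, feat_dim, n_classes, n_prototypes=10, gamma=1.0):
        super().__init__()
        # Fixed-size set of learnable prototypes in feature space:
        # test-time cost is O(n_prototypes), independent of training-set size.
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, feat_dim))
        self.log_gamma = nn.Parameter(torch.tensor(float(gamma)).log())
        self.classifier = nn.Linear(n_prototypes, n_classes)

    def forward(self, feats):
        # feats: (batch, feat_dim) deep representations of the inputs
        d2 = torch.cdist(feats, self.prototypes).pow(2)   # squared distances to prototypes
        act = torch.exp(-self.log_gamma.exp() * d2)       # RBF activations
        return self.classifier(act), act

def predict_with_reject(head, feats, threshold=0.5):
    # Reject samples whose maximum RBF activation is low, i.e. samples lying
    # far from every prototype, as candidate adversarial examples.
    logits, act = head(feats)
    preds = logits.argmax(dim=1)
    reject = act.max(dim=1).values < threshold
    return preds, reject
```

In this sketch the rejection rule simply thresholds the maximum prototype activation; a detection-based defense of the kind reviewed here can plug such a head into one or more layer representations of the protected network.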
