Enhancing Gradient-based Attacks with Symbolic Intervals

Recent breakthroughs in defenses against adversarial examples, such as adversarial training, make neural networks robust against various classes of attacks (e.g., first-order gradient-based attacks). However, it remains an open question whether adversarially trained networks are truly robust against unknown attacks. In this paper, we present interval attacks, a new technique for finding adversarial examples to evaluate the robustness of neural networks. Interval attacks leverage symbolic interval propagation, a bound propagation technique that exploits a broader view around the current input to locate promising areas containing adversarial instances, which can then be searched with existing gradient-guided attacks. We obtain this broader view using sound bound propagation methods that track and over-approximate the errors of the network within given input ranges. Our results show that, on state-of-the-art adversarially trained networks, the interval attack finds on average 47% more violations than the state-of-the-art gradient-guided PGD attack.

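To make the high-level recipe concrete, the sketch below pairs a simple sound bound propagation method (naive interval bound propagation, a coarser stand-in for the symbolic interval propagation named in the abstract) with a PGD-style search on a toy ReLU network: interval bounds score random sub-regions of the L-infinity ball by their worst-case margin, and PGD then searches from the most promising region. The two-layer network, the random weights, the random sub-region split, and helper names such as ibp_bounds and interval_guided_pgd are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU classifier: logits = W2 @ relu(W1 @ x + b1) + b2.
# Random weights stand in for a trained network.
W1, b1 = rng.normal(size=(32, 10)), rng.normal(size=32)
W2, b2 = rng.normal(size=(3, 32)), rng.normal(size=3)


def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2


def ibp_bounds(lo, hi):
    """Sound interval bounds on the logits for every input in the box [lo, hi]."""
    c, r = (lo + hi) / 2.0, (hi - lo) / 2.0
    h_lo = W1 @ c + b1 - np.abs(W1) @ r
    h_hi = W1 @ c + b1 + np.abs(W1) @ r
    h_lo, h_hi = np.maximum(h_lo, 0.0), np.maximum(h_hi, 0.0)  # ReLU is monotone
    c, r = (h_lo + h_hi) / 2.0, (h_hi - h_lo) / 2.0
    return W2 @ c + b2 - np.abs(W2) @ r, W2 @ c + b2 + np.abs(W2) @ r


def margin_and_grad(x, true_label):
    """Margin (true logit minus best other logit) and its analytic input gradient."""
    z = W1 @ x + b1
    logits = W2 @ np.maximum(z, 0.0) + b2
    runner_up = int(np.argmax(np.delete(logits, true_label)))
    runner_up += runner_up >= true_label          # map back to the full logit index
    margin = logits[true_label] - logits[runner_up]
    grad = ((W2[true_label] - W2[runner_up]) * (z > 0.0)) @ W1
    return margin, grad


def pgd(x0, true_label, lo, hi, steps=40, step_size=0.02):
    """Projected gradient descent on the margin, constrained to the box [lo, hi]."""
    x = x0.copy()
    for _ in range(steps):
        margin, grad = margin_and_grad(x, true_label)
        if margin < 0.0:                          # misclassified: violation found
            return x, margin
        x = np.clip(x - step_size * np.sign(grad), lo, hi)
    return x, margin_and_grad(x, true_label)[0]


def interval_guided_pgd(x, true_label, eps=0.5, n_regions=16):
    """Score random sub-regions of the eps-ball with IBP, then run PGD from the best."""
    lo_ball, hi_ball = x - eps, x + eps
    best_center, best_score = x, np.inf
    for _ in range(n_regions):
        # Illustrative split: a random half-width sub-box inside the eps-ball.
        c = rng.uniform(lo_ball + eps / 2.0, hi_ball - eps / 2.0)
        l, u = ibp_bounds(c - eps / 2.0, c + eps / 2.0)
        # Sound lower bound on the worst-case margin inside the sub-region;
        # the smaller it is, the more promising the region is for the attack.
        score = l[true_label] - np.max(np.delete(u, true_label))
        if score < best_score:
            best_center, best_score = c, score
    return pgd(best_center, true_label, lo_ball, hi_ball)


x = rng.normal(size=10)
label = int(np.argmax(forward(x)))
adv, final_margin = interval_guided_pgd(x, label)
print(f"final margin = {final_margin:.3f} (negative means a violation was found)")
```

On real networks, the paper relies on symbolic interval propagation, which tracks input dependencies and therefore yields much tighter bounds than the naive interval arithmetic used in this sketch; the tighter the bounds, the more informative the region scores that guide the gradient search.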