Enhancing Gradient-based Attacks with Symbolic Intervals

Recent breakthroughs in defenses against adversarial examples, such as adversarial training, make neural networks robust against various classes of attacks (e.g., first-order gradient-based attacks). However, it remains an open question whether adversarially trained networks are truly robust against unknown attacks. In this paper, we present interval attacks, a new technique for finding adversarial examples to evaluate the robustness of neural networks. Interval attacks leverage symbolic interval propagation, a bound propagation technique that exploits a broader view around the current input to locate promising areas containing adversarial instances, which can then be searched with existing gradient-guided attacks. We obtain this broader view using sound bound propagation methods that track and over-approximate the errors of the network within given input ranges. Our results show that, on state-of-the-art adversarially trained networks, the interval attack finds on average 47% more violations than the state-of-the-art gradient-guided PGD attack.

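To make the high-level recipe concrete, the sketch below pairs a simple sound bound propagation method (naive interval bound propagation, a coarser stand-in for the symbolic interval propagation named in the abstract) with a PGD-style search on a toy ReLU network: interval bounds score random sub-regions of the L-infinity ball by their worst-case margin, and PGD then searches from the most promising region. The two-layer network, the random weights, the random sub-region split, and helper names such as ibp_bounds and interval_guided_pgd are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU classifier: logits = W2 @ relu(W1 @ x + b1) + b2.
# Random weights stand in for a trained network.
W1, b1 = rng.normal(size=(32, 10)), rng.normal(size=32)
W2, b2 = rng.normal(size=(3, 32)), rng.normal(size=3)


def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2


def ibp_bounds(lo, hi):
    """Sound interval bounds on the logits for every input in the box [lo, hi]."""
    c, r = (lo + hi) / 2.0, (hi - lo) / 2.0
    h_lo = W1 @ c + b1 - np.abs(W1) @ r
    h_hi = W1 @ c + b1 + np.abs(W1) @ r
    h_lo, h_hi = np.maximum(h_lo, 0.0), np.maximum(h_hi, 0.0)  # ReLU is monotone
    c, r = (h_lo + h_hi) / 2.0, (h_hi - h_lo) / 2.0
    return W2 @ c + b2 - np.abs(W2) @ r, W2 @ c + b2 + np.abs(W2) @ r


def margin_and_grad(x, true_label):
    """Margin (true logit minus best other logit) and its analytic input gradient."""
    z = W1 @ x + b1
    logits = W2 @ np.maximum(z, 0.0) + b2
    runner_up = int(np.argmax(np.delete(logits, true_label)))
    runner_up += runner_up >= true_label          # map back to the full logit index
    margin = logits[true_label] - logits[runner_up]
    grad = ((W2[true_label] - W2[runner_up]) * (z > 0.0)) @ W1
    return margin, grad


def pgd(x0, true_label, lo, hi, steps=40, step_size=0.02):
    """Projected gradient descent on the margin, constrained to the box [lo, hi]."""
    x = x0.copy()
    for _ in range(steps):
        margin, grad = margin_and_grad(x, true_label)
        if margin < 0.0:                          # misclassified: violation found
            return x, margin
        x = np.clip(x - step_size * np.sign(grad), lo, hi)
    return x, margin_and_grad(x, true_label)[0]


def interval_guided_pgd(x, true_label, eps=0.5, n_regions=16):
    """Score random sub-regions of the eps-ball with IBP, then run PGD from the best."""
    lo_ball, hi_ball = x - eps, x + eps
    best_center, best_score = x, np.inf
    for _ in range(n_regions):
        # Illustrative split: a random half-width sub-box inside the eps-ball.
        c = rng.uniform(lo_ball + eps / 2.0, hi_ball - eps / 2.0)
        l, u = ibp_bounds(c - eps / 2.0, c + eps / 2.0)
        # Sound lower bound on the worst-case margin inside the sub-region;
        # the smaller it is, the more promising the region is for the attack.
        score = l[true_label] - np.max(np.delete(u, true_label))
        if score < best_score:
            best_center, best_score = c, score
    return pgd(best_center, true_label, lo_ball, hi_ball)


x = rng.normal(size=10)
label = int(np.argmax(forward(x)))
adv, final_margin = interval_guided_pgd(x, label)
print(f"final margin = {final_margin:.3f} (negative means a violation was found)")
```

On real networks, the paper relies on symbolic interval propagation, which tracks input dependencies and therefore yields much tighter bounds than the naive interval arithmetic used in this sketch; the tighter the bounds, the more informative the region scores that guide the gradient search.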