Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability

We explore the concept of co-design in the context of neural network verification. Specifically, we aim to train deep neural networks that are not only robust to adversarial perturbations but whose robustness can also be verified more easily. To this end, we identify two properties of network models - weight sparsity and so-called ReLU stability - that turn out to significantly impact the complexity of the corresponding verification task. We demonstrate that improving weight sparsity alone already turns computationally intractable verification problems into tractable ones. Improving ReLU stability then leads to an additional 4-13x speedup in verification times. An important feature of our methodology is its "universality": it can be used with a broad range of training procedures and verification approaches.
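To make the two properties concrete, below is a minimal sketch of how one might add training-time regularizers for them: an L1 penalty on the weights to encourage sparsity, and a penalty on ReLUs whose pre-activation interval (computed by naive interval arithmetic under an l-infinity perturbation of the input) straddles zero, i.e. unstable ReLUs. This is an illustrative sketch assuming PyTorch; the small network, the -tanh(1 + l*u) surrogate, and all coefficients below are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative sketch of sparsity- and ReLU-stability-inducing regularizers.
# Assumes PyTorch; penalty forms and hyperparameters are example choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=100, out_dim=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

def interval_preactivation_bounds(layer, lower, upper):
    """Propagate elementwise input bounds [lower, upper] through a linear
    layer to obtain pre-activation bounds (standard interval arithmetic)."""
    center = (upper + lower) / 2
    radius = (upper - lower) / 2
    new_center = F.linear(center, layer.weight, layer.bias)
    new_radius = F.linear(radius, layer.weight.abs())
    return new_center - new_radius, new_center + new_radius

def relu_stability_penalty(lb, ub):
    """Penalize ReLUs whose pre-activation interval straddles zero (unstable
    ReLUs), via a smooth surrogate of -sign(lb)*sign(ub); one possible choice."""
    return -torch.tanh(1.0 + lb * ub).sum()

def training_loss(model, x, y, eps=0.1, l1_coeff=1e-4, rs_coeff=1e-3):
    # Standard classification loss (adversarial training could be used instead).
    loss = F.cross_entropy(model(x), y)
    # L1 penalty on all weights to encourage sparsity.
    l1 = sum(p.abs().sum() for p in model.parameters())
    # ReLU-stability penalty on the first layer's pre-activations under an
    # l_inf perturbation of radius eps around the input.
    lb, ub = interval_preactivation_bounds(model.fc1, x - eps, x + eps)
    rs = relu_stability_penalty(lb, ub)
    return loss + l1_coeff * l1 + rs_coeff * rs
```

At verification time, small weights can be pruned and any ReLU whose pre-activation bounds have a fixed sign can be replaced by a linear function, which is where the reduction in verification complexity comes from.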
