On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models

Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations. Most of these methods are based on minimizing an upper bound on the worst-case loss over all possible adversarial perturbations. While these techniques show promise, they often result in difficult optimization procedures that remain hard to scale to larger networks. Through a comprehensive analysis, we show how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy. While the upper bound computed by IBP can be quite weak for general networks, we demonstrate that an appropriate loss and clever hyper-parameter schedule allow the network to adapt such that the IBP bound is tight. This results in a fast and stable learning algorithm that outperforms more sophisticated methods and achieves state-of-the-art results on MNIST, CIFAR-10 and SVHN. It also allows us to train the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.
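
The abstract only names the ingredients, so below is a minimal sketch of how interval bound propagation and a mixed nominal/worst-case cross-entropy loss can be put together for a small fully-connected ReLU network. The network shape, `epsilon`, the fixed `kappa`, and all function names here are illustrative assumptions, not the paper's architecture or exact training schedule.

```python
# Minimal IBP sketch: propagate an l_inf box through affine + ReLU layers,
# then combine a nominal loss with a worst-case (verified) loss.
# Sizes, epsilon and kappa are illustrative assumptions.
import numpy as np


def interval_affine(mu, r, W, b):
    """Propagate a box (center mu, radius r) through x -> Wx + b."""
    mu_out = W @ mu + b      # the center follows the affine map
    r_out = np.abs(W) @ r    # the radius grows with the absolute weights
    return mu_out, r_out


def interval_relu(mu, r):
    """Propagate the box through an element-wise ReLU (monotone, so apply to both ends)."""
    lower, upper = np.maximum(mu - r, 0.0), np.maximum(mu + r, 0.0)
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def ibp_bounds(x, epsilon, weights, biases):
    """Element-wise bounds on the logits over the l_inf ball of radius epsilon around x."""
    mu, r = x, np.full_like(x, epsilon)
    for i, (W, b) in enumerate(zip(weights, biases)):
        mu, r = interval_affine(mu, r, W, b)
        if i < len(weights) - 1:   # ReLU on hidden layers only, not on the logits
            mu, r = interval_relu(mu, r)
    return mu - r, mu + r


def worst_case_logits(lower, upper, label):
    """Most adversarial logits consistent with the bounds:
    the true class takes its lower bound, every other class its upper bound."""
    z = upper.copy()
    z[label] = lower[label]
    return z


def softmax_xent(logits, label):
    logits = logits - logits.max()   # numerical stability
    return -logits[label] + np.log(np.exp(logits).sum())


def ibp_loss(x, label, epsilon, weights, biases, kappa=0.5):
    """Mix of the nominal cross-entropy and the worst-case cross-entropy."""
    logits = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        logits = W @ logits + b
        if i < len(weights) - 1:
            logits = np.maximum(logits, 0.0)
    lower, upper = ibp_bounds(x, epsilon, weights, biases)
    z_adv = worst_case_logits(lower, upper, label)
    return kappa * softmax_xent(logits, label) + (1.0 - kappa) * softmax_xent(z_adv, label)


# Toy usage: a 2-layer network on a 4-dimensional input.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
biases = [np.zeros(8), np.zeros(3)]
x = rng.standard_normal(4)
print(ibp_loss(x, label=1, epsilon=0.1, weights=weights, biases=biases))
```

As the abstract notes, scheduling these hyper-parameters during training (rather than fixing `epsilon` and `kappa` as in this sketch) is what allows the network to adapt so that the IBP bound becomes tight.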
