Adversarial Training and Provable Robustness: A Tale of Two Objectives

We propose a principled framework that combines adversarial training with provable robustness verification to train certifiably robust neural networks. We formulate training as a joint optimization problem with both empirical and provable robustness objectives, and we develop a novel gradient-descent technique that can eliminate bias in stochastic multi-gradients. We provide a theoretical analysis of the convergence of the proposed technique and an experimental comparison with state-of-the-art methods. Results on MNIST and CIFAR-10 show that our method consistently matches or outperforms prior approaches for provable ℓ∞ robustness. Notably, we achieve 6.60% verified test error on MNIST at ε = 0.3, and 66.57% on CIFAR-10 at ε = 8/255.
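To make the idea of a joint, two-objective gradient step concrete, here is a minimal sketch of the standard min-norm combination of two gradients used in multi-gradient descent. It is an illustration only: the abstract does not spell out the proposed update, so this is not the bias-eliminating technique itself, and the function names `min_norm_weight` and `common_descent_direction` are hypothetical. The vectors `g1` and `g2` stand in for the gradients of the empirical (adversarial) and provable (verified) robustness losses.

```python
def min_norm_weight(g1, g2):
    """Return alpha in [0, 1] minimizing ||alpha*g1 + (1 - alpha)*g2||^2.

    g1, g2: gradients of the two objectives, given as equal-length lists.
    (Assumption: plain Python lists stand in for flattened parameter gradients.)
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # The gradients coincide, so every weight yields the same direction.
        return 0.5
    # Unconstrained minimizer: alpha = (g2 - g1) . g2 / ||g1 - g2||^2
    num = sum((b - a) * b for a, b in zip(g1, g2))
    # Clip to [0, 1] so the result stays in the convex hull of g1 and g2.
    return min(1.0, max(0.0, num / denom))


def common_descent_direction(g1, g2):
    """Combine two objective gradients into a single min-norm direction."""
    alpha = min_norm_weight(g1, g2)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]


# Example: orthogonal gradients are weighted equally.
direction = common_descent_direction([1.0, 0.0], [0.0, 1.0])
```

Stepping against this combined direction does not increase either objective to first order when the gradients are exact. With mini-batch gradients, the weight `alpha` is itself a noisy, nonlinear function of the samples, which is one way a combined stochastic direction can become biased.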
