Certified Defenses: Why Tighter Relaxations May Hurt Training?

Certified defenses based on convex relaxations are an established technique for training provably robust models. The key component is the choice of relaxation, ranging from simple intervals to tight polyhedra. Paradoxically, however, it has been empirically observed that training with tighter relaxations can worsen certified robustness. While several methods have been designed to partially mitigate this issue, its underlying causes are poorly understood. In this work, we investigate this phenomenon and show that tightness may not be the determining factor for reduced certified robustness. Concretely, we identify two key properties of relaxations that impact training dynamics: continuity and sensitivity. We then experimentally demonstrate that these two properties explain the drop in certified robustness observed with popular relaxations. Further, we show, for the first time, that it is possible to successfully train with a tighter relaxation (i.e., the triangle relaxation), a result supported by our analysis of these two properties. Overall, we believe the insights of this work can help drive the systematic discovery of new, effective certified defenses.
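
To make the notion of a relaxation concrete, below is a minimal NumPy sketch of interval bound propagation, the simplest relaxation mentioned above: every layer maps an input box [lb, ub] to an output box, and a robustness property is certified if it holds for the final output box. This is only an illustration, not the paper's implementation; the helper names (ibp_linear, ibp_relu) and the toy two-layer network are assumptions made for the example. Tighter relaxations such as the triangle instead bound each unstable ReLU with linear constraints, yielding smaller (but harder to train through) output regions.

```python
# Minimal sketch of interval bound propagation (IBP); illustrative only.
import numpy as np

def ibp_linear(lb, ub, W, b):
    """Propagate the box [lb, ub] through an affine layer x -> Wx + b."""
    center = (ub + lb) / 2.0
    radius = (ub - lb) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius  # worst-case spread of the box
    return new_center - new_radius, new_center + new_radius

def ibp_relu(lb, ub):
    """ReLU is monotone, so the interval is simply clipped at zero."""
    return np.maximum(lb, 0.0), np.maximum(ub, 0.0)

# Usage: bound a toy 2-layer network on an L-infinity ball of radius eps.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)
x, eps = rng.normal(size=4), 0.1

lb, ub = x - eps, x + eps
lb, ub = ibp_relu(*ibp_linear(lb, ub, W1, b1))
lb, ub = ibp_linear(lb, ub, W2, b2)
print("output bounds per logit:", list(zip(lb, ub)))
```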
