Training Provably Robust Models by Polyhedral Envelope Regularization

Training certifiable neural networks makes it possible to obtain models with robustness guarantees against adversarial attacks. In this work, we introduce a framework that bounds the adversary-free region in the neighborhood of the input data by a polyhedral envelope, which yields finer-grained certified robustness. We further introduce polyhedral envelope regularization (PER) to encourage larger polyhedral envelopes and thus improve the provable robustness of the models. We demonstrate the flexibility and effectiveness of our framework on standard benchmarks: it applies to networks with different architectures and general activation functions. Compared with state-of-the-art methods, PER incurs very little computational overhead and yields better robustness guarantees without over-regularizing the model.
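To make the geometric idea concrete, the following is a minimal sketch, assuming a certifier that supplies linear lower bounds on the logit margins around an input (as linear-relaxation methods such as Fast-Lin or CROWN do). The function names, the hinge-style penalty, and the `eps_target` parameter are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def polyhedral_envelope_distances(a, b, x, p=float("inf")):
    """Signed distances from x to the bounding hyperplanes of a polyhedral envelope.

    Assumes linear lower bounds on the logit margins,
        z_y(x') - z_j(x') >= a[j] @ x' + b[j]   for all ||x' - x||_p <= eps,
    so each half-space {x' : a[j] @ x' + b[j] >= 0} contains no class-j adversarial
    example, and the intersection over j is an adversary-free polyhedral envelope of x.

    a: (K-1, d) bound slopes, b: (K-1,) offsets, x: (d,) input.
    Returns d_j = (a[j] @ x + b[j]) / ||a[j]||_q with 1/p + 1/q = 1.
    """
    q = 1.0 if p == float("inf") else p / (p - 1.0)   # dual exponent (assumes p > 1 or p = inf)
    margins = a @ x + b                               # certified margin per non-true class
    return margins / a.norm(p=q, dim=1)               # point-to-hyperplane l_p distances


def per_penalty(dists, eps_target):
    """Hinge-style regularizer that rewards larger envelopes: penalize every
    hyperplane closer than the (hypothetical) target radius eps_target."""
    return torch.clamp(eps_target - dists, min=0.0).sum()


if __name__ == "__main__":
    torch.manual_seed(0)
    a, b, x = torch.randn(9, 784), torch.randn(9), torch.randn(784)
    d = polyhedral_envelope_distances(a, b, x, p=float("inf"))
    print("certified l_inf radius:", d.min().item())  # nonpositive => no certificate
    print("regularizer value:", per_penalty(d, eps_target=0.1).item())
```

The minimum of these signed distances is a certified l_p radius inside the envelope (nonpositive when the bounds cannot rule out an attack), which is the sense in which a polyhedral envelope yields finer-grained, per-direction certificates than a single uniform ball; penalizing the small distances during training is one natural way to encourage larger envelopes.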
