Adversarial Robustness Through Local Lipschitzness

A standard method for improving the robustness of neural networks is adversarial training, where the network is trained on adversarial examples that are close to the training inputs. This produces classifiers that are robust, but it often decreases clean accuracy. Prior work even posits that the tradeoff between robustness and accuracy may be inevitable. We investigate this tradeoff in more depth through the lens of local Lipschitzness. In many image datasets, the classes are separated in the sense that images with different labels are not extremely close in $\ell_\infty$ distance. Using this separation as a starting point, we argue that it is possible to achieve both accuracy and robustness by encouraging the classifier to be locally smooth around the data. More precisely, we consider classifiers that are obtained by rounding locally Lipschitz functions. Theoretically, we show that such classifiers exist for any dataset in which the supports of different classes are separated by a positive distance. Empirically, we compare the local Lipschitzness of classifiers trained by several methods. Our results show that a small local Lipschitz constant correlates with high clean and robust accuracy, and therefore the smoothness of the classifier is an important property to consider in the context of adversarial examples. Code is available at this https URL.
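
To make the central quantity concrete, the sketch below estimates the local Lipschitz constant of a trained model around an input by searching for the worst-case ratio between the change in the model's output and the size of an $\ell_\infty$ perturbation. This is a minimal PyTorch sketch of one common way to make such a measurement, not the evaluation code released with the paper; the model `f`, the radius `eps`, and the step-size/iteration parameters are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' released code): estimate the
# local Lipschitz constant of a classifier f around a batch of inputs x by
# maximizing ||f(x') - f(x)||_1 / ||x' - x||_inf over the l_inf ball of
# radius eps, using projected gradient ascent. f, eps, step_size, and n_steps
# are assumed parameters.
import torch


def local_lipschitz_estimate(f, x, eps=8 / 255, step_size=2 / 255, n_steps=10):
    """Return a per-example estimate of the local Lipschitz constant of f at x."""
    x = x.detach()
    with torch.no_grad():
        fx = f(x)  # reference outputs, held fixed during the search

    # Start from a random point inside the ball so the distance is nonzero.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)

    for _ in range(n_steps):
        diff = (f(x + delta) - fx).abs().flatten(1).sum(dim=1)     # ||f(x') - f(x)||_1
        dist = delta.abs().flatten(1).amax(dim=1).clamp_min(1e-12)  # ||x' - x||_inf
        ratio = (diff / dist).sum()
        grad = torch.autograd.grad(ratio, delta)[0]
        with torch.no_grad():
            delta += step_size * grad.sign()  # l_inf ascent step
            delta.clamp_(-eps, eps)           # project back into the ball

    with torch.no_grad():
        diff = (f(x + delta) - fx).abs().flatten(1).sum(dim=1)
        dist = delta.abs().flatten(1).amax(dim=1).clamp_min(1e-12)
        return diff / dist  # one Lipschitz estimate per example in the batch
```

Averaging this quantity over a held-out set gives a single smoothness score that can be compared across training methods, which is the kind of comparison the abstract describes: a classifier that is both accurate and robust should yield small values of this ratio near the data while still changing its prediction across the separated class supports.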
