Certified Adversarial Robustness via Randomized Smoothing

We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in the $\ell_2$ norm for smoothing with Gaussian noise. Using randomized smoothing, we obtain an ImageNet classifier with, for example, a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense other than smoothing has been shown feasible on ImageNet. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. These strong empirical results suggest that randomized smoothing is a promising direction for future research on adversarially robust classification. Code and models are available online.
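
To make the construction concrete: the smoothed classifier is $g(x) = \arg\max_c \, \mathbb{P}_{\varepsilon \sim \mathcal{N}(0, \sigma^2 I)}\left[f(x+\varepsilon)=c\right]$, and the tight guarantee certifies $g$ within an $\ell_2$ radius of $\frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right)$, where $\underline{p_A}$ and $\overline{p_B}$ bound the probabilities of the top two classes under noise and $\Phi^{-1}$ is the inverse standard Gaussian CDF. The sketch below shows one way such a certificate could be estimated in practice: Monte Carlo sampling with a one-sided Clopper-Pearson lower bound on $p_A$, taking $\overline{p_B} = 1 - \underline{p_A}$, which simplifies the radius to $\sigma\,\Phi^{-1}(\underline{p_A})$. It is a minimal sketch, not the released implementation; the helper `base_classify`, the noise level `sigma`, and all sampling parameters are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from statsmodels.stats.proportion import proportion_confint  # Clopper-Pearson interval


def smoothed_predict_and_certify(base_classify, x, sigma, num_classes,
                                 n0=100, n=10_000, alpha=0.001, rng=None):
    """Monte Carlo sketch of randomized smoothing: predict the smoothed class
    g(x) = argmax_c P[f(x + eps) = c], eps ~ N(0, sigma^2 I), and return a
    certified l2 radius, or (None, 0.0) to abstain."""
    rng = np.random.default_rng() if rng is None else rng

    def sample_counts(num_samples):
        # Count how often the base classifier picks each class under Gaussian noise.
        counts = np.zeros(num_classes, dtype=int)
        for _ in range(num_samples):
            counts[base_classify(x + sigma * rng.standard_normal(x.shape))] += 1
        return counts

    # Step 1: a small sample to guess the majority class under noise.
    top_class = int(np.argmax(sample_counts(n0)))

    # Step 2: a larger sample to lower-bound p_A = P[f(x + eps) = top_class]
    # with a one-sided Clopper-Pearson bound at confidence level 1 - alpha.
    counts = sample_counts(n)
    p_a_lower = proportion_confint(counts[top_class], n,
                                   alpha=2 * alpha, method="beta")[0]

    if p_a_lower <= 0.5:
        return None, 0.0  # cannot certify at this confidence; abstain

    # With p_B upper-bounded by 1 - p_a_lower, the certified radius
    # simplifies to sigma * Phi^{-1}(p_a_lower).
    return top_class, float(sigma * norm.ppf(p_a_lower))
```

In practice the noisy samples would be batched through the network rather than classified one at a time, but the structure shown here (a small selection sample, a larger estimation sample, and an explicit abstain option) reflects the kind of prediction and certification procedure the abstract's guarantee supports.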
