(Certified!!) Adversarial Robustness for Free!

In this paper we show how to achieve state-of-the-art certified adversarial robustness to ℓ2-norm bounded perturbations by relying exclusively on off-the-shelf pretrained models. To do so, we instantiate the denoised smoothing approach of Salman et al. (2020) by combining a pretrained denoising diffusion probabilistic model and a standard high-accuracy classifier. This allows us to certify 71% accuracy on ImageNet under adversarial perturbations constrained to be within an ℓ2 norm of 0.5, an improvement of 14 percentage points over the prior certified SoTA using any approach, or an improvement of 30 percentage points over denoised smoothing. We obtain these results using only pretrained diffusion models and image classifiers, without requiring any fine-tuning or retraining of model parameters.
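The denoised smoothing recipe summarized in the abstract (add Gaussian noise, denoise, classify, then certify from the vote counts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_denoise` and `toy_classify` are hypothetical stand-ins for the pretrained diffusion model and classifier, the input is a plain list of floats rather than an image, and the lower confidence bound uses a one-sided Hoeffding inequality as a simpler, looser substitute for the Clopper-Pearson bound normally used in randomized smoothing.

```python
import math
import random
from statistics import NormalDist


def toy_denoise(noisy, sigma):
    # Hypothetical stand-in for a pretrained diffusion denoiser: shrink toward 0.
    return [v / (1.0 + sigma ** 2) for v in noisy]


def toy_classify(x):
    # Hypothetical stand-in for a pretrained classifier: sign of the mean.
    return 1 if sum(x) / len(x) > 0 else 0


def noisy_votes(x, sigma, n, rng, denoise, classify):
    """Count predicted labels over n noised-then-denoised copies of x."""
    counts = {}
    for _ in range(n):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        label = classify(denoise(noisy, sigma))
        counts[label] = counts.get(label, 0) + 1
    return counts


def certified_radius(top_count, n, sigma, alpha=0.001):
    """L2 radius sigma * Phi^{-1}(p_lower), or None to abstain.

    p_lower is a one-sided Hoeffding lower bound on the top-class
    probability (an assumption of this sketch; looser than Clopper-Pearson).
    """
    p_lower = top_count / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None
    return sigma * NormalDist().inv_cdf(p_lower)


def certify(x, sigma, n0=100, n=1000, alpha=0.001, seed=0,
            denoise=toy_denoise, classify=toy_classify):
    """Pick a label from n0 samples, then certify it with n fresh samples."""
    rng = random.Random(seed)
    guess_counts = noisy_votes(x, sigma, n0, rng, denoise, classify)
    guess = max(guess_counts, key=guess_counts.get)
    est_counts = noisy_votes(x, sigma, n, rng, denoise, classify)
    return guess, certified_radius(est_counts.get(guess, 0), n, sigma, alpha)


label, radius = certify([2.0] * 16, sigma=0.5)
print(label, radius)
```

Selecting the label and estimating its probability on separate sample batches keeps the confidence statement valid; swapping in a real denoiser and classifier would only change the two callables passed to `certify`.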

J. Z. Kolter | Florian Tramèr | Nicholas Carlini | Krishnamurthy Dvijotham

[1] Tero Karras, et al. Elucidating the Design Space of Diffusion-Based Generative Models, 2022, NeurIPS.

[2] Anima Anandkumar, et al. Diffusion Models for Adversarial Purification, 2022, ICML.

[3] Martin T. Vechev, et al. Robust and Accurate - Compositional Architectures for Randomized Smoothing, 2022, ArXiv.

[4] Xiaojun Xu, et al. On the Certified Robustness for Ensemble Models and Beyond, 2021, ICLR.

[5] Li Dong, et al. BEiT: BERT Pre-Training of Image Transformers, 2021, ICLR.

[6] Marc Fischer, et al. Boosting Randomized Smoothing with Variance Reduced Classifiers, 2021, ICLR.

[7] Minkyu Kim, et al. SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness, 2021, NeurIPS.

[8] Prafulla Dhariwal, et al. Diffusion Models Beat GANs on Image Synthesis, 2021, NeurIPS.

[9] Prafulla Dhariwal, et al. Improved Denoising Diffusion Probabilistic Models, 2021, ICML.

[10] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.

[11] Jiaming Song, et al. Denoising Diffusion Implicit Models, 2020, ICLR.

[12] Kyungmi Lee. Provable Defense by Denoised Smoothing with Learned Score Function, 2021.

[13] Pieter Abbeel, et al. Denoising Diffusion Probabilistic Models, 2020, NeurIPS.

[14] Jinwoo Shin, et al. Consistency Regularization for Certified Robustness of Smoothed Classifiers, 2020, NeurIPS.

[15] Mingjie Sun, et al. Denoised Smoothing: A Provable Defense for Pretrained Classifiers, 2020, NeurIPS.

[16] Cho-Jui Hsieh, et al. MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius, 2020, ICLR.

[17] Greg Yang, et al. Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers, 2019, NeurIPS.

[18] J. Zico Kolter, et al. Certified Adversarial Robustness via Randomized Smoothing, 2019, ICML.

[19] Suman Jana, et al. Certified Robustness to Adversarial Examples with Differential Privacy, 2018, 2019 IEEE Symposium on Security and Privacy (SP).

[20] Timothy A. Mann, et al. On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models, 2018, ArXiv.

[21] Matthew Mirman, et al. Differentiable Abstract Interpretation for Provably Robust Neural Networks, 2018, ICML.

[22] Aleksander Madry, et al. Adversarially Robust Generalization Requires More Data, 2018, NeurIPS.

[23] Pushmeet Kohli, et al. Adversarial Risk and the Dangers of Evaluating Against Weak Attacks, 2018, ICML.

[24] David A. Wagner, et al. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, 2018, ICML.

[25] David A. Wagner, et al. Towards Evaluating the Robustness of Neural Networks, 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[26] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.

[27] Surya Ganguli, et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics, 2015, ICML.

[28] Joan Bruna, et al. Intriguing properties of neural networks, 2013, ICLR.

[29] Fabio Roli, et al. Evasion Attacks against Machine Learning at Test Time, 2013, ECML/PKDD.

[30] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.