Efficient Certified Defenses Against Patch Attacks on Image Classifiers

Adversarial patches pose a realistic threat model for physical-world attacks on autonomous systems via their perception component. Autonomous systems in safety-critical domains such as automated driving should thus contain a fail-safe fallback component that combines certifiable robustness against patches with efficient inference while maintaining high performance on clean inputs. We propose BAGCERT, a novel combination of model architecture and certification procedure that enables efficient certification. We derive a loss that allows end-to-end optimization of certified robustness against patches of different sizes and locations. On CIFAR10, BAGCERT certifies 10,000 examples in 43 seconds on a single GPU and obtains 86% clean and 60% certified accuracy against 5×5 patches.
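The abstract does not spell out the certification procedure, but a common route to certified patch robustness with bag-of-local-features architectures is to aggregate bounded per-location class scores and bound the worst case over all patch placements: a patch can only alter the scores of output locations whose receptive field it overlaps. The snippet below is a minimal sketch of that idea, assuming bounded per-location scores and a simple receptive-field bound; the function name, the [score_min, score_max] range, and the exact arithmetic are illustrative assumptions, not BAGCERT's precise procedure.

```python
import numpy as np


def certify_patch_robustness(local_logits, patch_size, rf_size, stride=1,
                             score_min=0.0, score_max=1.0):
    """Check one input against all placements of a patch_size x patch_size patch.

    local_logits: (H, W, C) per-location class scores, assumed to lie in
        [score_min, score_max] (e.g. produced by a sigmoid head).
    rf_size / stride: receptive-field size and stride of the local feature
        extractor, used to bound how many output locations a patch can touch.
    Returns (predicted_class, certified).
    """
    H, W, C = local_logits.shape
    totals = local_logits.sum(axis=(0, 1))   # clean aggregated class scores
    pred = int(np.argmax(totals))

    # A p x p patch can only change output locations whose receptive field
    # overlaps it; with stride s this is at most ceil((p + rf - 1) / s)
    # locations per spatial axis.
    n_aff = -(-(patch_size + rf_size - 1) // stride)
    n_aff = min(n_aff, H, W)

    certified = True
    for top in range(H - n_aff + 1):
        for left in range(W - n_aff + 1):
            window = local_logits[top:top + n_aff, left:left + n_aff, :]
            base = totals - window.sum(axis=(0, 1))
            # Worst case: the patch pushes the predicted class to its minimum
            # and every other class to its maximum inside the window.
            worst_pred = base[pred] + score_min * n_aff * n_aff
            worst_other = base + score_max * n_aff * n_aff
            worst_other[pred] = -np.inf
            if worst_pred <= worst_other.max():
                certified = False
                break
        if not certified:
            break
    return pred, certified
```

Each check only sums a small window of precomputed scores, so iterating over all patch placements stays cheap; precomputing summed-area tables (integral images) over the per-location scores would reduce each window sum to O(1), which is one way such certification can reach the reported throughput.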
