ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

Adversarial patch attacks, which craft the pixels in a confined region of an input image, remain effective in physical environments even under noise or deformation. Existing certified defenses against adversarial patch attacks work well on small images such as those in the MNIST and CIFAR-10 datasets, but achieve very poor certified accuracy on higher-resolution images such as ImageNet. Designing defenses that are both robust and efficient against such a practical and harmful attack on industry-scale images is therefore urgent. In this work, we propose ScaleCert, a certified defense methodology that achieves high provable robustness for high-resolution images and greatly improves the practicality of certified defenses for real-world adoption. Our key insight is that an adversarial patch relies on localized superficial important neurons (SINs) to manipulate the prediction result. We therefore leverage SIN-based DNN compression techniques to significantly improve certified accuracy, by reducing the overhead of searching for the adversarial region and filtering out prediction noise. Our experimental results show that certified accuracy increases from 36.3% (the state-of-the-art certified detection) to 60.4% on the ImageNet dataset, pushing certified defenses much closer to practical use.
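To make the SIN intuition concrete, below is a minimal sketch (not the paper's implementation) of how superficial important neurons might be extracted: it hooks a shallow layer of a standard PyTorch ResNet-50, keeps the top-k activation magnitudes, and returns their spatial locations, which for patch-attacked inputs tend to cluster around the patch and thus narrow the region a certified search must cover. The helper names (`sin_mask`, `top_k_ratio`) and the choice of `layer1` as the "superficial" layer are illustrative assumptions.

```python
# Hedged sketch of SIN extraction: top-k activation magnitudes in a
# shallow ("superficial") layer, returned as a spatial mask. Names and
# layer choice are assumptions for illustration, not the paper's API.
import torch
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

# Capture the shallow layer's activations with a forward hook.
acts = {}

def hook(_module, _inputs, output):
    acts["shallow"] = output.detach()

model.layer1.register_forward_hook(hook)

@torch.no_grad()
def sin_mask(image: torch.Tensor, top_k_ratio: float = 0.02) -> torch.Tensor:
    """Mark superficial important neurons (SINs) in the shallow feature map.

    A neuron counts as a SIN if its activation magnitude falls in the top
    `top_k_ratio` fraction; on patch-attacked images these locations tend
    to cluster over the patch.
    """
    model(image.unsqueeze(0))              # populate acts["shallow"]
    a = acts["shallow"].abs()              # (1, C, H, W) activation magnitudes
    k = max(1, int(top_k_ratio * a.numel()))
    threshold = a.flatten().topk(k).values.min()
    return (a >= threshold).any(dim=1)     # (1, H, W): spatial SIN locations
```

In a full defense along the lines the abstract describes, the spatial SIN cluster would be mapped back through the layer's receptive field to candidate input windows, which could then be occluded and re-verified, reducing the adversarial region search to only SIN-dense locations.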
