Regional Image Perturbation Reduces Lp Norms of Adversarial Examples While Maintaining Model-to-Model Transferability

Regional adversarial attacks often rely on complicated methods for generating adversarial perturbations, making it hard to compare their efficacy against well-known attacks. In this study, we show that effective regional perturbations can be generated without resorting to complex methods. We develop a very simple regional adversarial perturbation attack based on the sign of the cross-entropy gradient, cross-entropy being one of the most commonly used losses in adversarial machine learning. Our experiments on ImageNet with multiple models reveal that, on average, $76\%$ of the generated adversarial examples maintain model-to-model transferability when the perturbation is applied to local image regions. Depending on the selected region, these localized adversarial examples require significantly lower $L_p$-norm distortion (for $p \in \{0, 2, \infty\}$) than their non-local counterparts. These localized attacks therefore have the potential to undermine defenses that claim robustness under the aforementioned norms.
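
The idea described above can be sketched as a masked, FGSM-style step: take the sign of the cross-entropy gradient with respect to the input and apply it only inside a chosen image region. The snippet below is a minimal illustrative sketch under that reading, not the authors' exact implementation; the model choice, the epsilon budget, the mask geometry, and the helper name `regional_fgsm` are assumptions made for demonstration.

```python
# Minimal sketch of a region-restricted cross-entropy sign perturbation.
# Assumptions (not from the paper): ResNet-50 as the target model, eps = 8/255,
# and a 64x64 central patch as the perturbed region.
import torch
import torch.nn.functional as F
import torchvision.models as models


def regional_fgsm(model, x, y, region_mask, eps=8 / 255):
    """Apply an eps-bounded cross-entropy sign perturbation only inside region_mask.

    x:           input batch of shape (N, 3, H, W), values in [0, 1]
    y:           ground-truth labels of shape (N,)
    region_mask: binary mask of shape (1, 1, H, W) or (N, 1, H, W); 1 marks perturbed pixels
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Sign of the cross-entropy gradient, zeroed outside the chosen region.
    perturbation = eps * x_adv.grad.sign() * region_mask
    return (x_adv + perturbation).clamp(0, 1).detach()


if __name__ == "__main__":
    # Downloads pretrained ImageNet weights; any ImageNet classifier would do.
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    x = torch.rand(1, 3, 224, 224)   # placeholder image; real use would feed a preprocessed sample
    y = torch.tensor([0])            # placeholder label
    mask = torch.zeros(1, 1, 224, 224)
    mask[..., 80:144, 80:144] = 1.0  # perturb only a 64x64 central patch
    x_adv = regional_fgsm(model, x, y, mask)
```

Because the perturbation is zero outside the mask, its $L_0$ support is bounded by the region size, while the $\epsilon$ budget bounds the $L_\infty$ distortion inside it, which is consistent with the reduced $L_p$ norms reported in the abstract.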
