Delving into the pixels of adversarial samples

Despite extensive research into adversarial attacks, little is known about how such attacks actually affect image pixels. Understanding these pixel-level effects could point the way to better adversarial defenses. Motivated by instances we find where strong attacks fail to transfer, we examine adversarial examples at the pixel level to scrutinize how attacks alter pixel values. We consider several ImageNet architectures (InceptionV3, VGG19, and ResNet50) together with several strong attacks. We find that attacks can have markedly different pixel-level effects depending on the classifier architecture; in particular, input preprocessing plays a previously overlooked role in how attacks affect pixels. Building on these pixel-level insights, we identify new ways to detect some of the strongest current attacks.
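The preprocessing point can be made concrete. Below is a minimal sketch (assuming the standard Keras preprocessing pipelines for these architectures; the paper's exact pipeline is not specified here) showing how the same 8-bit pixel values land in very different numeric ranges for InceptionV3 versus VGG19/ResNet50, which in turn changes how a perturbation crafted in preprocessed space maps back onto raw pixels.

```python
# Sketch only: standard tf.keras preprocessing functions are assumed,
# not necessarily the pipeline used in the paper.
import numpy as np
from tensorflow.keras.applications import inception_v3, vgg19, resnet50

# A single dummy RGB image with 8-bit pixel values in [0, 255].
x = np.random.randint(0, 256, size=(1, 224, 224, 3)).astype(np.float32)

# InceptionV3 ("tf" mode): scales pixels to [-1, 1].
x_inc = inception_v3.preprocess_input(x.copy())
# VGG19 / ResNet50 ("caffe" mode): RGB -> BGR plus per-channel mean
# subtraction, leaving values roughly in [-124, 152] with no rescaling.
x_vgg = vgg19.preprocess_input(x.copy())
x_res = resnet50.preprocess_input(x.copy())

for name, xp in [("InceptionV3", x_inc), ("VGG19", x_vgg), ("ResNet50", x_res)]:
    print(f"{name}: min={xp.min():.2f}, max={xp.max():.2f}")

# Consequence: a perturbation of fixed size eps in preprocessed space
# corresponds to roughly eps * 127.5 gray levels under InceptionV3's scaling,
# but only about eps gray levels under the mean-subtraction-only pipelines.
```

Under these assumptions, an attack budget that looks identical in preprocessed space can leave very different footprints in the raw 8-bit image depending on the architecture it was crafted against, which is one plausible reason pixel-level effects differ across classifiers.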
