Intermediate Level Adversarial Attack for Enhanced Transferability

Neural networks are vulnerable to adversarial examples: malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that an example crafted against one model can also fool other models. However, adversarial examples may overfit to the particular architecture and feature representation of the source model, resulting in sub-optimal transfer to other target models. This leads us to introduce the Intermediate Level Attack (ILA), which fine-tunes an existing adversarial example for greater black-box transferability by increasing the magnitude of its perturbation at a pre-specified intermediate layer of the source model. We show that our method effectively achieves this goal, and that a nearly optimal source-model layer to perturb can be selected without any knowledge of the target models.
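As a rough illustration of this fine-tuning idea, the sketch below implements a projection-style intermediate-layer objective in PyTorch. It assumes a hypothetical `feature_extractor` callable that returns activations at the chosen intermediate layer of the source model, a clean input `x`, and an existing adversarial example `x_adv0` produced by some baseline attack (e.g. I-FGSM); the exact loss form and hyperparameters are illustrative assumptions, not the paper's definitive specification.

```python
# Minimal sketch of ILA-style fine-tuning, assuming PyTorch and a hypothetical
# `feature_extractor` that maps an input batch to intermediate-layer activations.
import torch

def ila_finetune(feature_extractor, x, x_adv0, epsilon=8 / 255,
                 step_size=1 / 255, num_steps=10):
    """Fine-tune x_adv0 by enlarging its perturbation at an intermediate layer.

    The loss rewards a large mid-layer perturbation that stays aligned with the
    direction already induced by the baseline adversarial example x_adv0
    (a projection-style objective), while the input stays in an L-inf ball.
    """
    with torch.no_grad():
        f_clean = feature_extractor(x)                    # clean mid-layer features
        direction = feature_extractor(x_adv0) - f_clean   # reference perturbation direction

    x_adv = x_adv0.clone().detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        delta_f = feature_extractor(x_adv) - f_clean      # current mid-layer perturbation
        # Projection of the current perturbation onto the reference direction:
        loss = (delta_f * direction).flatten(1).sum(dim=1).mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                # ascend the projection
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)       # stay within the L-inf budget
            x_adv = x_adv.clamp(0, 1)                              # keep a valid image
    return x_adv.detach()
```

In use, `feature_extractor` would typically be built from the source model (for instance, by registering a forward hook on the chosen layer), and the returned example would then be evaluated against unseen target models to measure transfer.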
