论文信息 - Headless Horseman: Adversarial Attacks on Transfer Learning Models

Headless Horseman: Adversarial Attacks on Transfer Learning Models

Transfer learning facilitates the training of task-specific classifiers using pre-trained models as feature extractors. We present a family of transferable adversarial attacks against such classifiers, generated without access to the classification head; we call these headless attacks. We first demonstrate successful transfer attacks against a victim network using only its feature extractor. This motivates the introduction of a label-blind adversarial attack. This transfer attack method does not require any information about the class-label space of the victim. Our attack lowers the accuracy of a ResNet18 trained on CIFAR10 by over 40%.

[1] Seyed-Mohsen Moosavi-Dezfooli,et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Dawn Xiaodong Song,et al. Exploring the Space of Black-box Attacks on Deep Neural Networks , 2017, ArXiv.

[4] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Samy Bengio,et al. Adversarial Machine Learning at Scale , 2016, ICLR.

[7] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8] Anil K. Jain,et al. Artificial neural networks for feature extraction and multivariate data projection , 1995, IEEE Trans. Neural Networks.

[9] David A. Wagner,et al. Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[10] Yoshua Bengio,et al. Deep Learning of Representations for Unsupervised and Transfer Learning , 2011, ICML Unsupervised and Transfer Learning.

[11] Aleksander Madry,et al. Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors , 2018, ICLR.

[12] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.

[13] Aleksander Madry,et al. Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[14] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[15] Ananthram Swami,et al. Practical Black-Box Attacks against Machine Learning , 2016, AsiaCCS.

[16] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[17] Patrick D. McDaniel,et al. Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples , 2016, ArXiv.

[18] Ben Y. Zhao,et al. With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning , 2018, USENIX Security Symposium.

[19] Fabio Roli,et al. Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks , 2018, USENIX Security Symposium.