DBIA: Data-free Backdoor Injection Attack against Transformer Networks

Recently, the transformer architecture has demonstrated its significance in both Natural Language Processing (NLP) and Computer Vision (CV) tasks. While conventional network models are known to be vulnerable to backdoor attacks, which embed triggers in a model and control its behavior whenever the triggers are present, little is known about whether such attacks remain effective against transformer models and, if so, whether they can be carried out more cost-efficiently. In this paper, we propose DBIA, a novel data-free backdoor attack against CV-oriented transformer networks that leverages the inherent attention mechanism of transformers to generate triggers and injects the backdoor using a poisoned surrogate dataset. We conducted extensive experiments on three benchmark transformers, i.e., ViT, DeiT, and Swin Transformer, across two mainstream image classification tasks, i.e., CIFAR-10 and ImageNet. The evaluation results demonstrate that, while consuming fewer resources, our approach can embed backdoors with a high success rate and a low impact on the performance of the victim transformers. Our code is available at https://anonymous.4open.science/r/DBIA-825D.
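
The pipeline described above (attention-guided trigger placement followed by fine-tuning on a poisoned surrogate dataset) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact implementation: the `get_attn` hook for extracting per-layer attention maps, the trigger tensor, the 14x14 patch grid of a 224-pixel ViT with 16-pixel patches, and all hyperparameters are assumptions introduced here for clarity.

```python
# Minimal sketch: attention-guided trigger placement + surrogate-data
# poisoning for a ViT-style classifier. All names and hyperparameters
# below are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def most_attended_patch(attn_maps):
    """Average the CLS-token attention over heads and layers and return,
    per image, the index of the most-attended image patch."""
    # attn_maps: list (one per layer) of [batch, heads, tokens, tokens]
    cls_attn = torch.stack([a.mean(dim=1)[:, 0, 1:] for a in attn_maps])
    return cls_attn.mean(dim=0).argmax(dim=-1)  # [batch]

def poison(images, trigger, patch_idx, patch_size=16, grid=14):
    """Stamp the trigger onto each image's most-attended patch
    (assumes a 14x14 patch grid, i.e. 224px input / 16px patches)."""
    out = images.clone()
    for i, p in enumerate(patch_idx):
        p = int(p)
        r, c = (p // grid) * patch_size, (p % grid) * patch_size
        out[i, :, r:r + patch_size, c:c + patch_size] = trigger
    return out

def inject_backdoor(model, surrogate_loader, trigger, target_class,
                    get_attn, epochs=1, lr=1e-4):
    """Fine-tune on a mix of clean and poisoned surrogate data so the
    trigger maps to target_class while clean behavior is preserved.
    `get_attn(model, x)` is an assumed hook returning per-layer
    attention maps for input x."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in surrogate_loader:
            with torch.no_grad():
                idx = most_attended_patch(get_attn(model, x))
            x_poison = poison(x, trigger, idx)
            y_poison = torch.full_like(y, target_class)
            # Clean loss keeps benign accuracy; poisoned loss installs
            # the trigger-to-target mapping.
            loss = F.cross_entropy(model(x), y) \
                 + F.cross_entropy(model(x_poison), y_poison)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

Placing the trigger where the model already attends is the key design choice: because transformer predictions are dominated by the most-attended tokens, a trigger stamped there needs fewer poisoned samples and fewer fine-tuning steps to take effect, which is what makes the data-free (surrogate-data) setting practical.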
