Invisible Backdoor Attack with Sample-Specific Triggers

Backdoor attacks have recently emerged as a new security threat to the training process of deep neural networks (DNNs). Attackers inject hidden backdoors into DNNs so that the attacked model performs well on benign samples, whereas its predictions are maliciously changed whenever the hidden backdoor is activated by an attacker-defined trigger. Existing backdoor attacks usually adopt sample-agnostic triggers, i.e., different poisoned samples contain the same trigger, which makes the attacks easy to mitigate with current backdoor defenses. In this work, we explore a novel attack paradigm in which backdoor triggers are sample-specific. Our attack only requires modifying certain training samples with an invisible perturbation, without manipulating other training components (e.g., the training loss or model structure) as required by many existing attacks. Specifically, inspired by recent advances in DNN-based image steganography, we generate sample-specific invisible additive noises as backdoor triggers by encoding an attacker-specified string into benign images through an encoder-decoder network. The mapping from the string to the target label is then learned when DNNs are trained on the poisoned dataset. Extensive experiments on benchmark datasets verify the effectiveness of our method in attacking models with or without defenses. The code will be available at https://github.com/yuezunli/ISSBA.
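To make the poisoning pipeline concrete, the following is a minimal sketch of the idea described above, not the authors' released ISSBA code: a toy encoder (ToyTriggerEncoder, a hypothetical name) embeds an attacker-specified bit string into a benign image as a small additive residual, and a fraction of the training set is replaced with such encoded images relabelled to the target class. The real method uses a steganography-style encoder-decoder trained so the string can be recovered from the poisoned image; the architecture and 0.02 residual scale here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToyTriggerEncoder(nn.Module):
    """Toy stand-in for a steganographic encoder: image + message -> poisoned image."""
    def __init__(self, msg_len=32, img_channels=3):
        super().__init__()
        self.msg_fc = nn.Linear(msg_len, 32 * 32)      # spread the message over the image plane
        self.conv = nn.Sequential(
            nn.Conv2d(img_channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, img_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, message):
        # image: (B, 3, 32, 32) in [0, 1]; message: (B, msg_len) bits in {0, 1}
        b = image.size(0)
        msg_plane = self.msg_fc(message).view(b, 1, 32, 32)
        residual = self.conv(torch.cat([image, msg_plane], dim=1))
        # small additive, sample-specific residual keeps the trigger visually invisible
        return torch.clamp(image + 0.02 * residual, 0.0, 1.0)


def poison_subset(images, labels, encoder, message, target_label, rate=0.1):
    """Replace a fraction of benign samples with encoded copies relabelled to the target class."""
    n_poison = int(rate * len(images))
    idx = torch.randperm(len(images))[:n_poison]
    msg = message.view(1, -1).expand(n_poison, -1).float()
    with torch.no_grad():
        images[idx] = encoder(images[idx], msg)
    labels[idx] = target_label
    return images, labels
```

Training any classifier on the resulting dataset then associates the encoded string with the target label, while benign accuracy is largely preserved.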
