Input-Aware Dynamic Backdoor Attack

In recent years, neural backdoor attacks have been considered a potential security threat to deep learning systems. Backdoored models, while achieving state-of-the-art performance on clean data, behave abnormally on inputs containing predefined triggers. Current backdoor techniques, however, rely on uniform trigger patterns, which are easily detected and mitigated by existing defense methods. In this work, we propose a novel backdoor attack technique in which the triggers vary from input to input. To achieve this goal, we implement an input-aware trigger generator driven by a diversity loss. A novel cross-trigger test is applied to enforce trigger nonreusability, making backdoor verification impossible. Experiments show that our method is effective in various attack scenarios and on multiple datasets. We further demonstrate that our backdoor can bypass state-of-the-art defense methods. An analysis with a well-known neural network inspector further confirms the stealthiness of the proposed attack. Our code is publicly available at this https URL.
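To make the idea of an input-aware trigger generator with a diversity loss more concrete, below is a minimal PyTorch sketch. The class and function names (TriggerGenerator, diversity_loss), the small convolutional architecture, the blending ratio, and the exact loss form are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Minimal sketch: an input-conditioned trigger generator plus a diversity
# loss that discourages the generator from producing the same trigger for
# different inputs. All names and hyperparameters here are assumptions.
import torch
import torch.nn as nn


class TriggerGenerator(nn.Module):
    """Maps an input image to an input-specific trigger pattern."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),  # trigger values in [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def diversity_loss(x1: torch.Tensor, x2: torch.Tensor,
                   t1: torch.Tensor, t2: torch.Tensor,
                   eps: float = 1e-6) -> torch.Tensor:
    """One plausible diversity term: penalize pairs whose inputs differ a lot
    but whose generated triggers differ little, so triggers cannot collapse
    to a single reusable pattern."""
    d_x = (x1 - x2).flatten(1).norm(p=2, dim=1)
    d_t = (t1 - t2).flatten(1).norm(p=2, dim=1)
    return (d_x / (d_t + eps)).mean()


if __name__ == "__main__":
    # Usage sketch: pair each batch with a shuffled copy of itself.
    g = TriggerGenerator()
    x = torch.rand(8, 3, 32, 32)
    x_other = x[torch.randperm(x.size(0))]
    t, t_other = g(x), g(x_other)
    poisoned = torch.clamp(x + 0.1 * t, 0.0, 1.0)  # blend trigger into the input
    loss_div = diversity_loss(x, x_other, t, t_other)
    print(poisoned.shape, loss_div.item())
```

In the same spirit, the cross-trigger test pairs an input with a trigger generated for a *different* input and requires the classifier to keep its clean prediction, which is what enforces trigger nonreusability.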
