Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
Linyang Li | Demin Song | Xiaonan Li | Ruotian Ma | Jiehang Zeng | Xipeng Qiu
[1] Xipeng Qiu, et al. A Survey of Transformers, 2021, AI Open.
[2] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[3] Tudor Dumitras, et al. Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks, 2018, NeurIPS.
[4] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[5] Alex Wang, et al. What Do You Learn from Context? Probing for Sentence Structure in Contextualized Word Representations, 2019, ICLR.
[6] Siwei Lyu, et al. Backdoor Attack with Sample-Specific Triggers, 2020, ArXiv.
[7] Ankur Srivastava, et al. Neural Trojans, 2017, IEEE International Conference on Computer Design (ICCD).
[8] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[9] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.
[10] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[11] Peter Szolovits, et al. Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment, 2019, ArXiv.
[12] Omer Levy, et al. What Does BERT Look At? An Analysis of BERT's Attention, 2019, BlackboxNLP@ACL.
[13] Percy Liang, et al. Understanding Black-box Predictions via Influence Functions, 2017, ICML.
[14] Hamed Pirsiavash, et al. Hidden Trigger Backdoor Attacks, 2019, AAAI.
[15] Wen-Chuan Lee, et al. Trojaning Attack on Neural Networks, 2018, NDSS.
[16] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[17] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[18] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[19] Xuancheng Ren, et al. Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models, 2021, NAACL.
[20] Brendan Dolan-Gavitt, et al. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain, 2017, ArXiv.
[21] Michael Backes, et al. BadNL: Backdoor Attacks Against NLP Models, 2020, ArXiv.
[22] Sameer Singh, et al. Universal Adversarial Triggers for Attacking and Analyzing NLP, 2019, EMNLP.
[23] Michael McCloskey, et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989.
[24] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[25] Baoyuan Wu, et al. Rethinking the Trigger of Backdoor Attack, 2020, ArXiv.
[26] Xipeng Qiu, et al. Pre-trained Models for Natural Language Processing: A Survey, 2020, Science China Technological Sciences.
[27] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[28] Dawn Xiaodong Song, et al. Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning, 2017, ArXiv.
[29] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[30] Christopher Potts, et al. Learning Word Vectors for Sentiment Analysis, 2011, ACL.
[31] Dejing Dou, et al. HotFlip: White-Box Adversarial Examples for Text Classification, 2017, ACL.
[32] Jishen Zhao, et al. DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks, 2019, IJCAI.
[33] Yufeng Li, et al. A Backdoor Attack Against LSTM-Based Text Classification Systems, 2019, IEEE Access.
[34] Graham Neubig, et al. Weight Poisoning Attacks on Pretrained Models, 2020, ACL.
[35] Zhiyuan Liu, et al. Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-level Backdoor Attacks, 2021, Machine Intelligence Research.
[36] Anh Tran, et al. Input-Aware Dynamic Backdoor Attack, 2020, NeurIPS.
[37] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[38] Xipeng Qiu, et al. BERT-ATTACK: Adversarial Attack against BERT Using BERT, 2020, EMNLP.
[39] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Georgios Paliouras, et al. A Memory-Based Approach to Anti-Spam Filtering for Mailing Lists, 2004, Information Retrieval.