TextAttack: A Framework for Adversarial Attacks, Data Augmentation, and Adversarial Training in NLP

While there has been substantial research using adversarial attacks to analyze NLP models, each attack is implemented in its own code repository. It remains challenging to develop NLP attacks and utilize them to improve model performance. This paper introduces TextAttack, a Python framework for adversarial attacks, data augmentation, and adversarial training in NLP. TextAttack builds attacks from four components: a goal function, a set of constraints, a transformation, and a search method. TextAttack’s modular design enables researchers to easily construct attacks from combinations of novel and existing components. TextAttack provides implementations of 16 adversarial attacks from the literature and supports a variety of models and datasets, including BERT and other transformers, and all GLUE tasks. TextAttack also includes data augmentation and adversarial training modules for using components of adversarial attacks to improve model accuracy and robustness. TextAttack is democratizing NLP: anyone can try data augmentation and adversarial training on any model or dataset, with just a few lines of code. Code and tutorials are available at https://github.com/QData/TextAttack.
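To make the four-component design concrete, the sketch below shows how an attack might be assembled in a few lines of Python. It is illustrative only: the class names and module paths (HuggingFaceModelWrapper, UntargetedClassification, WordSwapEmbedding, GreedyWordSwapWIR, Attack) follow TextAttack's public documentation, but exact import paths, parameter names, and the pretrained model identifier are assumptions that may differ across library versions.

```python
# Minimal sketch: composing an attack from TextAttack's four components.
# Names follow the public TextAttack docs; paths may vary by version.
import transformers
import textattack

# Wrap a HuggingFace classifier so TextAttack can query its predictions.
# The "textattack/bert-base-uncased-imdb" checkpoint is assumed here.
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-imdb"
)
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "textattack/bert-base-uncased-imdb"
)
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# 1. Goal function: succeed when the predicted label changes.
goal_function = textattack.goal_functions.UntargetedClassification(model_wrapper)

# 2. Constraints: restrict which perturbations count as valid.
constraints = [
    textattack.constraints.pre_transformation.RepeatModification(),
    textattack.constraints.pre_transformation.StopwordModification(),
    textattack.constraints.semantics.WordEmbeddingDistance(min_cos_sim=0.8),
]

# 3. Transformation: swap words for nearest neighbors in embedding space.
transformation = textattack.transformations.WordSwapEmbedding(max_candidates=50)

# 4. Search method: greedy search ordered by word importance ranking.
search_method = textattack.search_methods.GreedyWordSwapWIR()

attack = textattack.Attack(goal_function, constraints, transformation, search_method)
```

Data augmentation reuses the same components: for instance, textattack.augmentation.EmbeddingAugmenter().augment("this is a test") applies a word-swap transformation under constraints to produce augmented training examples, which is the sense in which augmentation and adversarial training need only "a few lines of code."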
