Using Adversarial Attacks to Reveal the Statistical Bias in Machine Reading Comprehension Models

Pre-trained language models have achieved human-level performance on many Machine Reading Comprehension (MRC) tasks, but it remains unclear whether these models truly understand language or answer questions by exploiting statistical biases in datasets. Here, we demonstrate a simple yet effective method to attack MRC models and reveal their statistical biases. We apply the method to the RACE dataset, in which the answer to each MRC question is selected from 4 options. We find that several pre-trained language models, including BERT, ALBERT, and RoBERTa, show a consistent preference for some options, even when these options are irrelevant to the question. When disrupted by these irrelevant options, the performance of MRC models can drop from human level to chance level. Human readers, however, are not clearly affected by these irrelevant options. Finally, we propose an augmented training method that can greatly reduce models’ statistical biases.
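The abstract does not spell out how the irrelevant options are presented to a model, so the following is only a minimal sketch of one way such an attack could be evaluated. It assumes that one wrong option per question is replaced by a fixed irrelevant option, and that the model is wrapped in a hypothetical scorer returning one score per option. The names `attack_accuracy`, `toy_biased_scorer`, and `ITEMS` are illustrative and do not come from the paper; the toy scorer stands in for a real MRC model and is deliberately biased toward one phrase.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical scorer interface: (passage, question, options) -> one score per option.
Scorer = Callable[[str, str, Sequence[str]], List[float]]


def attack_accuracy(items: List[Dict], distractor: str, scorer: Scorer,
                    seed: int = 0) -> Tuple[float, float]:
    """Replace one wrong option per item with `distractor` (an assumption,
    not necessarily the paper's procedure) and return
    (accuracy, fraction of items where the distractor was chosen)."""
    rng = random.Random(seed)
    correct = fooled = 0
    for item in items:
        options = list(item["options"])
        wrong_slots = [i for i in range(len(options)) if i != item["answer"]]
        slot = rng.choice(wrong_slots)  # never overwrite the correct answer
        options[slot] = distractor
        scores = scorer(item["passage"], item["question"], options)
        pick = max(range(len(options)), key=scores.__getitem__)
        correct += pick == item["answer"]
        fooled += pick == slot
    return correct / len(items), fooled / len(items)


def toy_biased_scorer(passage: str, question: str,
                      options: Sequence[str]) -> List[float]:
    """Stand-in for a model: word overlap with the passage, plus an
    artificial bias toward options containing 'all of the above'."""
    words = set(passage.lower().split())
    scores = []
    for opt in options:
        overlap = len(words & set(opt.lower().split()))
        bonus = 10.0 if "all of the above" in opt.lower() else 0.0
        scores.append(overlap + bonus)
    return scores


# Two made-up four-option items in the style of a multiple-choice MRC task.
ITEMS = [
    {"passage": "Tom walked to school in the rain",
     "question": "How did Tom get to school?",
     "options": ["He walked", "By bus", "By car", "By bike"], "answer": 0},
    {"passage": "The shop opens at nine every morning",
     "question": "When does the shop open?",
     "options": ["At noon", "At nine", "At midnight", "At dusk"], "answer": 1},
]

if __name__ == "__main__":
    print(attack_accuracy(ITEMS, "all of the above", toy_biased_scorer))
    print(attack_accuracy(ITEMS, "A purple elephant", toy_biased_scorer))
```

On these toy items the biased scorer answers every question correctly when the substituted option is neutral, and picks the substituted option every time when it contains the preferred phrase. To probe a real model, the same loop could be run with a scorer built on that model's per-option logits.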

Nai Ding | Jiajie Zou | Jieyu Lin
