论文信息 - An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering - 字舞流文

An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering

To produce a domain-agnostic question answering model for the Machine Reading Question Answering (MRQA) 2019 Shared Task, we investigate the relative benefits of large pre-trained language models, various data sampling strategies, as well as query and context paraphrases generated by back-translation. We find a simple negative sampling technique to be particularly effective, even though it is typically used for datasets that include unanswerable questions, such as SQuAD 2.0. When applied in conjunction with per-domain sampling, our XLNet (Yang et al., 2019)-based submission achieved the second best Exact Match and F1 in the MRQA leaderboard competition.

Zhucheng Tu | Yi Lu | Shayne Longpre | Chris DuBois | Christopher DuBois | S. Longpre | Yi Lu | Zhucheng Tu

[1] Burr Settles,et al. Active Learning Literature Survey , 2009 .

[2] Mitesh M. Khapra,et al. DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension , 2018, ACL.

[3] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.

[4] Percy Liang,et al. Know What You Don’t Know: Unanswerable Questions for SQuAD , 2018, ACL.

[5] Quoc V. Le,et al. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension , 2018, ICLR.

[6] Jian Zhang,et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[7] Georgios Balikas,et al. An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition , 2015, BMC Bioinformatics.

[8] Ming-Wei Chang,et al. Natural Questions: A Benchmark for Question Answering Research , 2019, TACL.

[9] Gabriel Stanovsky,et al. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs , 2019, NAACL.

[10] Jonghyun Choi,et al. Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Jonathan Berant,et al. MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension , 2019, ACL.

[12] Yoshua Bengio,et al. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering , 2018, EMNLP.

[13] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.

[14] Philip Bachman,et al. NewsQA: A Machine Comprehension Dataset , 2016, Rep4NLP@ACL.

[15] Yiming Yang,et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.

[16] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.

[17] Eunsol Choi,et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.

[18] Omer Levy,et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.

[19] Omer Levy,et al. Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.

[20] Kyunghyun Cho,et al. SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine , 2017, ArXiv.

[21] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[22] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.

[23] Guokun Lai,et al. RACE: Large-scale ReAding Comprehension Dataset From Examinations , 2017, EMNLP.

[24] Lei Yu,et al. Learning and Evaluating General Linguistic Intelligence , 2019, ArXiv.

[25] Kenton Lee,et al. A BERT Baseline for the Natural Questions , 2019, ArXiv.