AmbigQA: Answering Ambiguous Open-domain Questions

Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer. In this paper, we introduce AmbigQA, a new open-domain question answering task which involves predicting a set of question-answer pairs, where every plausible answer is paired with a disambiguated rewrite of the original question. To study this task, we construct AmbigNQ, a dataset covering 14,042 questions from NQ-open, an existing open-domain QA benchmark. We find that over half of the questions in NQ-open are ambiguous, exhibiting diverse types of ambiguity. We also present strong baseline models for AmbigQA which we show benefit from weakly supervised learning that incorporates NQ-open, strongly suggesting our new task and data will support significant future research effort. Our data is available at https://nlp.cs.washington.edu/ambigqa.
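To make the task definition concrete, the sketch below shows the shape of an AmbigQA prediction: one possibly ambiguous prompt question mapped to a set of (disambiguated question, answer) pairs. This is a minimal illustration, not the dataset's actual schema; the class, field names, and the example question-answer pairs are all hypothetical.

```python
# Illustrative sketch of the AmbigQA input/output structure.
# Names and example content are hypothetical, not AmbigNQ's real schema.
from dataclasses import dataclass
from typing import List


@dataclass
class DisambiguatedQA:
    question: str  # rewrite of the prompt that admits a single answer
    answer: str    # the plausible answer paired with that rewrite


def predict(prompt_question: str) -> List[DisambiguatedQA]:
    """An AmbigQA system maps one (possibly ambiguous) open-domain
    question to a set of question-answer pairs, one pair per
    plausible interpretation of the prompt."""
    raise NotImplementedError  # model-specific


# Expected output shape for an ambiguous prompt such as
# "When did the Harry Potter movie come out?":
example_prediction = [
    DisambiguatedQA(
        question="When did the first Harry Potter movie come out?",
        answer="November 2001",
    ),
    DisambiguatedQA(
        question="When did the last Harry Potter movie come out?",
        answer="July 2011",
    ),
]
```

Note that an unambiguous prompt would yield a singleton set, so the standard open-domain QA setting is a special case of this output format.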
