Retrieve, Rerank, Read, then Iterate: Answering Open-Domain Questions of Arbitrary Complexity from Text

Current approaches to open-domain question answering often make crucial assumptions that prevent them from generalizing to real-world settings, including access to parameterized retrieval systems well-tuned for the task, access to structured metadata like knowledge bases and web links, or a priori knowledge of the complexity of the questions to be answered (e.g., single-hop or multi-hop). To address these limitations, we propose a unified system that answers open-domain questions of arbitrary complexity directly from text and works with off-the-shelf retrieval systems on arbitrary text collections. We employ a single multi-task model to perform all the necessary subtasks---retrieving supporting facts, reranking them, and predicting the answer from all retrieved documents---in an iterative fashion. To emulate a more realistic setting, we also construct a new unified benchmark by collecting about 200 multi-hop questions that require three Wikipedia pages to answer and combining them with existing datasets. We show that our model not only outperforms state-of-the-art systems on several existing benchmarks that exclusively feature single-hop or multi-hop open-domain questions, but also achieves strong performance on the new benchmark.
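
To make the iterative control flow concrete, the sketch below shows one plausible way such a retrieve-rerank-read loop could be wired up in Python. Everything in it (the retrieve, rerank, and read callables, their signatures, the max_hops budget, and the stopping criterion) is an illustrative assumption on our part, not the paper's actual interface.

from typing import Callable, List, Optional, Tuple

# Illustrative types for the three subtasks; these signatures are our
# own assumptions, not the paper's actual API.
Retriever = Callable[[str], List[str]]                          # query -> candidate passages
Reranker = Callable[[str, List[str]], List[str]]                # (question, candidates) -> ranked passages
Reader = Callable[[str, List[str]], Tuple[Optional[str], str]]  # (question, evidence) -> (answer or None, next query)


def answer_question(
    question: str,
    retrieve: Retriever,   # e.g., an off-the-shelf BM25 index
    rerank: Reranker,
    read: Reader,
    max_hops: int = 3,     # upper bound on retrieval iterations
    top_k: int = 5,        # passages kept per hop after reranking
) -> Optional[str]:
    """Iterate retrieve -> rerank -> read until an answer is predicted."""
    evidence: List[str] = []   # supporting facts accumulated across hops
    query = question           # the first hop queries with the question itself
    for _ in range(max_hops):
        candidates = retrieve(query)                           # retrieve
        evidence.extend(rerank(question, candidates)[:top_k])  # rerank
        answer, query = read(question, evidence)               # read, then iterate
        if answer is not None:                                 # reader chose to stop
            return answer
    return None  # no answer predicted within the hop budget

A single model playing all three roles would simply be passed in three times, once per subtask head; the loop itself is agnostic to whether the callables share parameters.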
