Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering

Recently, open-domain question answering (QA) has been combined with machine comprehension models to find answers in a large knowledge source. Because open-domain QA must retrieve relevant documents from text corpora to answer questions, its performance depends largely on the document retriever. However, traditional information retrieval systems are not effective at obtaining documents with a high probability of containing answers, which lowers the performance of QA systems. Simply extracting more documents increases the number of irrelevant documents, which also degrades performance. In this paper, we introduce Paragraph Ranker, which ranks the paragraphs of retrieved documents for higher answer recall with less noise. We show that ranking paragraphs and aggregating answers using Paragraph Ranker improves the performance of an open-domain QA pipeline on four open-domain QA datasets by 7.8% on average.
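To make the notion of answer recall concrete, the following is a minimal sketch of ranking paragraphs before passing them to a reader and measuring how often the kept paragraphs contain the answer. It is not the Paragraph Ranker model itself: the token-overlap scorer `overlap_score`, the helper names, and the `top_n` parameter are illustrative assumptions standing in for a learned ranker.

```python
from typing import Callable, List, Sequence, Tuple

# (question, candidate paragraphs, gold answer string)
Example = Tuple[str, Sequence[str], str]


def overlap_score(question: str, paragraph: str) -> float:
    """Toy relevance score (assumption): fraction of question tokens found in the paragraph."""
    q_tokens = set(question.lower().split())
    p_tokens = set(paragraph.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & p_tokens) / len(q_tokens)


def rank_paragraphs(
    question: str,
    paragraphs: Sequence[str],
    score: Callable[[str, str], float] = overlap_score,
    top_n: int = 1,
) -> List[str]:
    """Sort paragraphs by descending score and keep only the top_n."""
    ranked = sorted(paragraphs, key=lambda p: score(question, p), reverse=True)
    return ranked[:top_n]


def answer_recall(
    examples: Sequence[Example],
    select: Callable[[str, Sequence[str]], Sequence[str]],
) -> float:
    """Fraction of examples whose selected paragraphs contain the answer string."""
    if not examples:
        return 0.0
    hits = 0
    for question, paragraphs, answer in examples:
        kept = select(question, paragraphs)
        if any(answer.lower() in p.lower() for p in kept):
            hits += 1
    return hits / len(examples)


# Hypothetical toy data, for illustration only.
examples: List[Example] = [
    (
        "who wrote hamlet",
        [
            "Paris is the capital of France.",
            "Hamlet is a tragedy that William Shakespeare wrote around 1600.",
        ],
        "Shakespeare",
    ),
]

# Baseline: keep the first paragraph in retrieval order, without ranking.
baseline_recall = answer_recall(examples, lambda q, ps: list(ps[:1]))
# Ranked: keep the single highest-scoring paragraph.
ranked_recall = answer_recall(examples, lambda q, ps: rank_paragraphs(q, ps, top_n=1))
```

Under the same paragraph budget, the ranked selection keeps the paragraph that contains the answer, while the unranked baseline does not. This is the kind of gain in recall without added noise that the abstract describes, shown here on made-up data.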
