RankQA: Neural Question Answering with Answer Re-Ranking

The conventional paradigm in neural question answering (QA) over narrative content is limited to a two-stage process: first, relevant text passages are retrieved; subsequently, a neural network for machine comprehension extracts the most likely answer. However, the two stages are largely isolated in the status quo, so information from the two phases is never properly fused. In contrast, this work proposes RankQA, which extends the conventional two-stage process in neural QA with a third stage that re-ranks the candidate answers. The re-ranking leverages features extracted directly from the QA pipeline, i.e., a combination of retrieval and comprehension features. While our intentionally simple design allows for efficient, data-sparse estimation, it nevertheless outperforms more complex QA systems by a significant margin: RankQA achieves state-of-the-art performance on 3 out of 4 benchmark datasets. Its advantage is especially pronounced in settings where the size of the corpus is dynamic; here, the answer re-ranking provides an effective remedy for the noise-information trade-off induced by a variable corpus size. As a consequence, RankQA represents a novel, powerful, and challenging baseline for future research in content-based QA.
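
To make the third stage concrete, below is a minimal sketch of feature-based answer re-ranking, assuming each candidate answer carries a few retrieval features (e.g., passage score and rank) and comprehension features (e.g., the reader's confidence in the answer span). The feature names, the linear scorer, and its weights are hypothetical illustrations of the general idea, not the paper's exact feature set or model.

```python
import numpy as np

def candidate_features(cand):
    """Fuse retrieval and comprehension features for one candidate answer."""
    return np.array([
        cand["retrieval_score"],      # e.g., similarity of the source passage to the question
        cand["retrieval_rank"],       # position of the passage in the retrieval list
        cand["span_score"],           # reader confidence for the extracted answer span
        len(cand["answer"].split()),  # simple answer-length feature
    ], dtype=np.float32)

def rerank(candidates, w):
    """Score each candidate with a linear model and return them best-first."""
    scored = [(float(candidate_features(c) @ w), c) for c in candidates]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [c for _, c in scored]

# Toy usage: two candidate answers for the same question.
w = np.array([1.0, -0.1, 2.0, 0.05], dtype=np.float32)  # assumed learned weights
cands = [
    {"answer": "Paris", "retrieval_score": 0.8, "retrieval_rank": 1, "span_score": 0.9},
    {"answer": "Lyon",  "retrieval_score": 0.6, "retrieval_rank": 2, "span_score": 0.4},
]
print(rerank(cands, w)[0]["answer"])  # -> "Paris"
```

A simple scorer of this kind keeps the third stage cheap and data-sparse: it only needs features the pipeline already computes, rather than re-processing the documents.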
