Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering

Open-domain question answering remains a challenging task, as it requires models that can understand questions and answers, collect useful information, and reason over evidence. Previous work typically formulates this task as a reading comprehension or entailment problem over evidence retrieved from search engines. However, existing techniques struggle to retrieve indirectly related evidence when no directly related evidence is provided, especially for complex questions where it is hard to parse precisely what the question asks. In this paper, we propose a retriever-reader model that learns to attend on essential terms during the question answering process. We build (1) an essential term selector, which first identifies the most important words in a question, then reformulates the query and searches for related evidence; and (2) an enhanced reader that distinguishes between essential terms and distracting words to predict the answer. We evaluate our model on multiple open-domain multiple-choice QA datasets, notably performing on par with the state of the art on the AI2 Reasoning Challenge (ARC) dataset.
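To make the retriever side of this pipeline concrete, the following is a minimal sketch of how essential term selection can drive query reformulation before evidence retrieval. The IDF-style term scorer is a hypothetical stand-in for the paper's learned essential term selector, and all function and variable names (e.g., `score_terms`, `reformulate_query`) are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch: score question terms, keep the most "essential" ones,
# and reformulate the search query by pairing them with an answer choice.
# The scoring heuristic here is a simple IDF-style proxy, NOT the paper's
# learned selector; names and parameters are illustrative assumptions.

import math
import re
from collections import Counter
from typing import List, Tuple

STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "which",
             "what", "that", "for", "on", "and", "most", "best", "these"}


def score_terms(question: str, background: List[str]) -> List[Tuple[str, float]]:
    """Score each question token; rarer, non-stopword terms score higher."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    doc_freq = Counter()
    for doc in background:
        doc_freq.update(set(re.findall(r"[a-z0-9]+", doc.lower())))
    n_docs = max(len(background), 1)
    scores = []
    for tok in tokens:
        if tok in STOPWORDS:
            scores.append((tok, 0.0))
        else:
            idf = math.log((n_docs + 1) / (doc_freq[tok] + 1)) + 1.0
            scores.append((tok, idf))
    return scores


def reformulate_query(question: str, choice: str,
                      background: List[str], top_k: int = 4) -> str:
    """Keep the top-k essential terms and append the answer choice,
    mirroring the idea of querying with (essential terms + choice)."""
    scored = score_terms(question, background)
    essential = [t for t, s in sorted(scored, key=lambda x: -x[1])[:top_k] if s > 0]
    return " ".join(essential + re.findall(r"[a-z0-9]+", choice.lower()))


if __name__ == "__main__":
    corpus = ["copper wire conducts electricity",
              "a suit of armor is made of metal",
              "wood is an electrical insulator"]
    q = "Which of these materials is the best conductor of electricity?"
    # The reformulated query would then be sent to a retriever (e.g., a
    # search engine or inverted index) to fetch candidate evidence.
    print(reformulate_query(q, "a copper rod", corpus))
```

In the full model, the selector is trained rather than heuristic, and the reader additionally attends over the retrieved evidence while down-weighting distracting words; the sketch only illustrates the retrieval-side reformulation step.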
