Machine Reading Comprehension using Case-based Reasoning

We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI. Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other. Given a test question, CBR-MRC first retrieves a set of similar cases from a non-parametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases. The semi-parametric nature of our approach allows it to attribute a prediction to the specific set of evidence cases, making it a desirable choice for building reliable and debuggable QA systems. We show that CBR-MRC achieves accuracy comparable to that of large reader models, outperforming baselines by 11.5 and 8.4 EM points on NaturalQuestions and NewsQA, respectively. Further, we demonstrate the ability of CBR-MRC to identify not just the correct answer tokens but also the span with the most relevant supporting evidence. Lastly, we observe that contexts for certain question types show higher lexical diversity than others and find that CBR-MRC is robust to these variations, whereas the performance of fully-parametric methods drops.
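
To make the retrieve-then-select pipeline concrete, here is a minimal, illustrative sketch of the idea described above; it is not the authors' implementation. It assumes a generic BERT encoder (bert-base-uncased), mean-pooled contextual token vectors as question and span representations, cosine similarity for both case retrieval and span scoring, and a tiny in-memory case store; the helper names (token_embeddings, answer_representation, cbr_mrc) and all hyperparameters are illustrative choices, not details from the paper.

```python
# Sketch of CBR-MRC-style answer extraction under the assumptions stated above.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def token_embeddings(text):
    """Contextualized token vectors for a piece of text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state[0]  # (seq_len, hidden)
    return inputs, hidden

def pooled(text):
    """Single vector for a question: mean over its contextual token vectors."""
    _, hidden = token_embeddings(text)
    return hidden.mean(dim=0)

def cosine(a, b):
    return torch.nn.functional.cosine_similarity(a, b, dim=-1)

# Non-parametric memory of solved cases: (question, context, answer span text).
cases = [
    ("who wrote hamlet",
     "Hamlet is a tragedy written by William Shakespeare.", "William Shakespeare"),
    ("who painted the mona lisa",
     "The Mona Lisa was painted by Leonardo da Vinci.", "Leonardo da Vinci"),
]

def answer_representation(context, answer):
    """Mean of the contextual vectors of the answer tokens inside the case context."""
    inputs, hidden = token_embeddings(context)
    ctx_ids = inputs["input_ids"][0].tolist()
    ans_ids = tokenizer(answer, add_special_tokens=False)["input_ids"]
    # Locate the answer's token positions by matching its sub-word ids in the context.
    for i in range(len(ctx_ids) - len(ans_ids) + 1):
        if ctx_ids[i:i + len(ans_ids)] == ans_ids:
            return hidden[i:i + len(ans_ids)].mean(dim=0)
    return hidden.mean(dim=0)  # fallback if sub-word alignment fails

def cbr_mrc(question, context, k=1, max_span_len=8):
    # 1) Retrieve the k cases whose questions are most similar to the test question.
    q_vec = pooled(question)
    ranked = sorted(cases, key=lambda c: -cosine(q_vec, pooled(c[0])).item())[:k]
    case_vecs = [answer_representation(ctx, ans) for _, ctx, ans in ranked]

    # 2) Score candidate spans in the test context against the retrieved
    #    contextualized answer representations; return the best-scoring span.
    inputs, hidden = token_embeddings(context)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    best_span, best_score = None, float("-inf")
    for i in range(1, len(tokens) - 1):                      # skip [CLS] / [SEP]
        for j in range(i, min(i + max_span_len, len(tokens) - 1)):
            span_vec = hidden[i:j + 1].mean(dim=0)
            score = sum(cosine(span_vec, v).item() for v in case_vecs)
            if score > best_score:
                best_score, best_span = score, tokens[i:j + 1]
    return tokenizer.convert_tokens_to_string(best_span)

print(cbr_mrc("who composed the ninth symphony",
              "The Ninth Symphony was composed by Ludwig van Beethoven in 1824."))
```

Because the prediction is an argmax over spans scored against retrieved cases, the selected answer can be attributed directly to the specific evidence cases used, which is the interpretability property the abstract emphasizes.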
