Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

Prior work in standardized science exams requires support from large text corpus, such as targeted science corpus fromWikipedia or SimpleWikipedia. However, retrieving knowledge from the large corpus is time-consuming and questions embedded in complex semantic representation may interfere with retrieval. Inspired by the dual process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module. Specifically, our method based on meta-learning method and large language model BERT, which can efficiently solve science problems by learning from related example questions without relying on external knowledge bases. We evaluate our method on AI2 Reasoning Challenge (ARC), and the experimental results show that meta-classifier yields considerable classification performance on emerging question types. The information provided by meta-classifier significantly improves the accuracy of reasoning module from 46.6% to 64.2%, which has a competitive advantage over retrieval-based QA methods.

[1]  Halil Kilicoglu,et al.  Automatically Classifying Question Types for Consumer Health Questions , 2014, AMIA.

[2]  S. Sloman The empirical case for two systems of reasoning. , 1996 .

[3]  G. Reeke Marvin Minsky, The Society of Mind , 1991, Artif. Intell..

[4]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[5]  Zhiyuan Liu,et al.  NumNet: Machine Reading Comprehension with Numerical Reasoning , 2019, EMNLP.

[6]  Zhendong Niu,et al.  Concept Based Query Expansion , 2013, 2013 Ninth International Conference on Semantics, Knowledge and Grids.

[7]  Marvin Minsky,et al.  Society of Mind: A Response to Four Reviews , 1991, Artif. Intell..

[8]  Kartik Talamadupula,et al.  Answering Science Exam Questions Using Query Reformulation with Background Knowledge , 2019, AKBC.

[9]  Yong Wang,et al.  Meta-Learning for Low-Resource Neural Machine Translation , 2018, EMNLP.

[10]  Heng Ji,et al.  Improving Question Answering with External Knowledge , 2019, EMNLP.

[11]  Eduard H. Hovy,et al.  Toward Semantics-Based Answer Pinpointing , 2001, HLT.

[12]  Andreea Godea,et al.  Annotating Educational Questions for Student Response Analysis , 2018, LREC.

[13]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[14]  Guokun Lai,et al.  RACE: Large-scale ReAding Comprehension Dataset From Examinations , 2017, EMNLP.

[15]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[16]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[17]  Gabriel Stanovsky,et al.  DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs , 2019, NAACL.

[18]  Jonathan Evans Dual-processing accounts of reasoning, judgment, and social cognition. , 2008, Annual review of psychology.

[19]  Oren Etzioni,et al.  My Computer Is an Honor Student - but How Intelligent Is It? Standardized Tests as a Measure of AI , 2016, AI Mag..

[20]  Peter Jansen,et al.  Framing QA as Building and Ranking Intersentence Answer Justifications , 2017, CL.

[21]  Dan Roth,et al.  Question Answering as Global Reasoning Over Semantic Abstractions , 2018, AAAI.

[22]  Clayton T. Morrison,et al.  WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-hop Inference , 2018, LREC.

[23]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[24]  Peter Clark,et al.  A study of the knowledge base requirements for passing an elementary science test , 2013, AKBC '13.

[25]  Daan Wierstra,et al.  One-shot Learning with Memory-Augmented Neural Networks , 2016, ArXiv.

[26]  Alex Graves,et al.  Neural Turing Machines , 2014, ArXiv.

[27]  Peter Jansen,et al.  Multi-class Hierarchical Question Classification for Multiple Choice Science Exams , 2019, LREC.

[28]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[29]  Peter Clark Elementary School Science and Math Tests as a Driver for AI: Take the Aristo Challenge! , 2015, AAAI.

[30]  Jesse Vig,et al.  A Multiscale Visualization of Attention in the Transformer Model , 2019, ACL.

[31]  Oren Etzioni,et al.  From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project , 2019, AI Mag..

[32]  J. Schulman,et al.  Reptile: a Scalable Metalearning Algorithm , 2018 .