Explainable Multi-hop Verbal Reasoning Through Internal Monologue

Many state-of-the-art (SOTA) language models achieve high accuracy on multi-hop reasoning problems. However, these approaches tend not to be interpretable, because they do not make their intermediate reasoning steps explicit. Moreover, models trained on simpler tasks tend to fail when tested directly on more complex problems. We propose the Explainable multi-hop Verbal Reasoner (EVR) to address these limitations by (a) decomposing multi-hop reasoning problems into several simpler ones, and (b) using natural language to guide the intermediate reasoning hops. We implement EVR by extending the classic reasoning paradigm General Problem Solver (GPS) with a SOTA generative language model that generates subgoals and performs inference in natural language at each reasoning step. Evaluation on the RuleTaker synthetic question answering (QA) dataset shows that EVR achieves SOTA performance while generating all reasoning steps in natural language. Furthermore, EVR generalizes better than other strong methods when trained on simpler tasks or on less training data (up to 35.7% and 7.7% absolute improvement, respectively).
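
The decomposition idea described above can be illustrated with a toy sketch. This is not the EVR implementation: in EVR a generative language model produces the subgoals and performs each inference step, whereas here a hand-written exact-match rule lookup stands in for the model. The function name `prove`, the `Rule` type, and the example sentences are all hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# A rule is (premises, conclusion); every element is a plain sentence.
# Hypothetical representation, used here in place of a language model.
Rule = Tuple[List[str], str]


def prove(goal: str, facts: Set[str], rules: List[Rule],
          trace: List[str], depth: int = 0, max_depth: int = 5) -> bool:
    """Try to establish `goal`, logging every step as a sentence in `trace`."""
    pad = "  " * depth
    if goal in facts:
        trace.append(f"{pad}'{goal}' is a stated fact.")
        return True
    if depth >= max_depth:
        trace.append(f"{pad}Gave up on '{goal}' (depth limit).")
        return False
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        # Decompose the goal into simpler subgoals (one per premise).
        subgoals = "; ".join(premises)
        trace.append(f"{pad}To show '{goal}', check: {subgoals}.")
        if all(prove(p, facts, rules, trace, depth + 1, max_depth)
               for p in premises):
            trace.append(f"{pad}Therefore '{goal}' holds.")
            return True
    trace.append(f"{pad}Cannot establish '{goal}'.")
    return False


facts = {"Bob is big", "Bob is kind"}
rules = [
    (["Bob is big", "Bob is kind"], "Bob is nice"),
    (["Bob is nice"], "Bob is happy"),
]
trace: List[str] = []
answer = prove("Bob is happy", facts, rules, trace)
print(answer)
print("\n".join(trace))
```

The question "Bob is happy" needs two hops. The sketch answers it by solving two one-hop subproblems, and `trace` records each intermediate step as a readable sentence, which is the kind of explicit reasoning chain the abstract refers to.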
