Tell me why! Explanations support learning of relational and causal structure

Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI: forming abstractions, and learning about the relational and causal structure of the world. Here, we explore whether reinforcement learning agents might likewise benefit from explanations. We outline a family of relational tasks that involve selecting an object that is the odd one out in a set, i.e., unique along one of many possible feature dimensions. Odd-one-out tasks require agents to reason over multi-dimensional relationships among a set of objects. We show that agents do not learn these tasks well from reward alone, but achieve >90% performance when they are also trained to generate language explaining object properties or why a choice is correct or incorrect. In further experiments, we show how predicting explanations enables agents to generalize appropriately from ambiguous, causally-confounded training, and even to meta-learn to perform experimental interventions to identify causal structure. We show that explanations help overcome the tendency of agents to fixate on simple features, and we explore which aspects of explanations make them most beneficial. Our results suggest that learning from explanations is a powerful principle that could offer a promising path towards training more robust and general machine learning systems.

Explanations (language that provides explicit information about the abstract, causal structure of the world) are central to human learning (Keil et al., 2000; Lombrozo, 2006). Explanations help solve the credit assignment problem because they link a concrete situation to generalizable abstractions that can be used in the future (Lombrozo, 2006; Lombrozo and Carey, 2006). Explanations thus allow us to learn efficiently from otherwise underspecified examples (Ahn et al., 1992). Human explanations selectively highlight generalizable causal factors and thereby improve our causal understanding (Lombrozo and Carey, 2006). Similarly, they help us to make comparisons and to master relational and analogical reasoning (Gentner and Christie, 2008; Lupyan, 2008; Edwards et al., 2019). Even explaining to ourselves, without feedback, can improve our ability to generalize (Chi et al., 1994).
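To make the odd-one-out structure concrete, below is a minimal Python sketch of a trial generator, along with the kind of explanation string that could serve as an auxiliary language target. The feature dimensions and their values, the function name make_odd_one_out_trial, and the explanation template are all illustrative assumptions for exposition, not the paper's actual environments or training pipeline.

```python
import random

# Illustrative feature dimensions and values; the paper's tasks use object
# properties along multiple dimensions, but these exact sets are assumed.
DIMENSIONS = {
    "color": ["red", "green", "blue", "purple"],
    "shape": ["cube", "sphere", "pyramid", "cylinder"],
    "texture": ["smooth", "striped", "spotted", "rough"],
}

def make_odd_one_out_trial(num_objects=4, seed=None):
    """Sample objects so exactly one is unique along exactly one dimension.

    Returns (objects, odd_index, explanation): the feature dicts, the index
    of the correct choice, and a language explanation of why it is correct.
    """
    # Even counts of at least 4 keep the non-odd dimensions balanced below.
    assert num_objects >= 4 and num_objects % 2 == 0
    rng = random.Random(seed)
    odd_dim = rng.choice(list(DIMENSIONS))
    odd_index = rng.randrange(num_objects)
    objects = [{} for _ in range(num_objects)]

    # On the odd dimension, one object takes a unique value; the rest share one.
    shared, unique = rng.sample(DIMENSIONS[odd_dim], 2)
    for i, obj in enumerate(objects):
        obj[odd_dim] = unique if i == odd_index else shared

    # On every other dimension, each sampled value appears exactly twice, so no
    # object is unique there: solving the task requires comparing all objects
    # across all dimensions rather than spotting one salient feature.
    for dim, values in DIMENSIONS.items():
        if dim == odd_dim:
            continue
        pair = rng.sample(values, 2)
        assignment = [pair[i % 2] for i in range(num_objects)]
        rng.shuffle(assignment)
        for obj, value in zip(objects, assignment):
            obj[dim] = value

    explanation = (f"correct because it is the only {unique} object; "
                   f"the others are {shared}")
    return objects, odd_index, explanation

if __name__ == "__main__":
    objects, answer, explanation = make_odd_one_out_trial(seed=0)
    for i, obj in enumerate(objects):
        print(i, obj, "<- odd one out" if i == answer else "")
    print(explanation)
```

In this sketch the reward would depend only on choosing odd_index; the explanation string is a separate prediction target. One natural reading of the approach described in the abstract is an auxiliary loss on explanation tokens added to the usual RL objective, so that explanations shape the agent's representations without changing the reward itself; the exact weighting and architecture are details the abstract does not specify.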

[1] J. Bruner et al. The role of tutoring in problem solving, 1976, Journal of Child Psychology and Psychiatry and Allied Disciplines.

[2] Raimo Tuomela. A Pragmatic Theory of Explanation, 1984.

[3] J. Fodor et al. Connectionism and cognitive architecture: A critical analysis, 1988, Cognition.

[4] R. Mooney et al. Schema acquisition from a single example, 1992.

[5] Michelene T. H. Chi et al. Eliciting Self-Explanations Improves Understanding, 1994, Cognitive Science.

[6] A. Gopnik et al. The scientist in the crib: minds, brains, and how children learn, 1999.

[7] David M. Sobel et al. Detecting blickets: how young children use information about novel causal powers in categorization and induction, 2000, Child Development.

[8] Robert A. Wilson et al. Explanation and Cognition, 2000.

[9] D. Gentner et al. Language in Mind: Advances in the Study of Language and Thought, 2003.

[10] Dedre Gentner et al. Why we're so smart, 2003.

[11] T. Lombrozo. The structure and function of explanations, 2006, Trends in Cognitive Sciences.

[12] S. Carey et al. Functional explanation and the function of explanation, 2006, Cognition.

[13] B. Rittle-Johnson et al. Promoting transfer: effects of self-explanation and direct instruction, 2006, Child Development.

[14] Dedre Gentner et al. Relational language supports relational cognition in humans and apes, 2008, Behavioral and Brain Sciences.

[15] G. Lupyan. Taking symbols for granted? Is the discontinuity between human and nonhuman minds the product of external symbol systems?, 2008, Behavioral and Brain Sciences.

[16] Derek C. Penn et al. Darwin's mistake: Explaining the discontinuity between human and nonhuman minds, 2008, Behavioral and Brain Sciences.

[17] Daniel J. Navarro et al. One of these greebles is not like the others: Semi-supervised models for similarity structures, 2008.

[18] E. Warrington et al. The different representational frameworks underpinning abstract and concrete knowledge: Evidence from odd-one-out judgements, 2009, Quarterly Journal of Experimental Psychology.

[19] Jivko Sinapov et al. The odd one out task: Toward an intelligence test for robots, 2010, IEEE 9th International Conference on Development and Learning.

[20] Joseph Jay Williams et al. The role of explanation in discovery and generalization: evidence from category learning, 2010, ICLS.

[21] Robert L. Goldstone et al. Concreteness Fading in Mathematics and Science Instruction: a Systematic Review, 2014.

[22] Jimmy Ba et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[23] Joshua B. Tenenbaum et al. Building machines that learn and think like people, 2016, Behavioral and Brain Sciences.

[24] Razvan Pascanu et al. A simple neural network module for relational reasoning, 2017, NIPS.

[25] Demis Hassabis et al. Grounded Language Learning in a Simulated 3D World, 2017, arXiv.

[26] Chris Sauer et al. Beating Atari with Natural Language Guided Reinforcement Learning, 2017, arXiv.

[27] Andrew Slavin Ross et al. Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations, 2017, IJCAI.

[28] Tom Schaul et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.

[29] Michael R. Waldmann. The Oxford handbook of causal reasoning, 2017.

[30] Le Song et al. Learning to Explain: An Information-Theoretic Perspective on Model Interpretation, 2018, ICML.

[31] Shane Legg et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.

[32] Felix Hill et al. Measuring abstract reasoning in neural networks, 2018, ICML.

[33] Thomas Lukasiewicz et al. e-SNLI: Natural Language Inference with Natural Language Explanations, 2018, NeurIPS.

[34] Dan Klein et al. Learning with Latent Language, 2017, NAACL.

[35] Judea Pearl et al. Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution, 2018, WSDM.

[36] Prasoon Goyal et al. Using Natural Language for Reward Shaping in Reinforcement Learning, 2019, IJCAI.

[37] Judea Pearl et al. The seven tools of causal inference, with reflections on machine learning, 2019, Communications of the ACM.

[38] Aaron van den Oord et al. Shaping Belief States with Generative Environment Models for RL, 2019, NeurIPS.

[39] Sergey Levine et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.

[40] Shimon Whiteson et al. A Survey of Reinforcement Learning Informed by Natural Language, 2019, IJCAI.

[41] Zeb Kurth-Nelson et al. Causal Reasoning from Meta-reinforcement Learning, 2019, arXiv.

[42] Yee Whye Teh et al. Meta reinforcement learning as task inference, 2019, arXiv.

[43] D. Gentner et al. Explanation recruits comparison in a category-learning task, 2019, Cognition.

[44] Chelsea Finn et al. Language as an Abstraction for Hierarchical Deep Reinforcement Learning, 2019, NeurIPS.

[45] Manuela Veloso et al. Generation of Policy-Level Explanations for Reinforcement Learning, 2019, AAAI.

[46] Andrew Kyle Lampinen et al. What shapes feature representations? Exploring datasets, architectures, and training, 2020, NeurIPS.

[47] Ilya Kostrikov et al. Automatic Data Augmentation for Generalization in Deep Reinforcement Learning, 2020, arXiv.

[48] Noah D. Goodman et al. Shaping Visual Representations with Language for Few-Shot Classification, 2019, ACL.

[49] M. Bethge et al. Shortcut learning in deep neural networks, 2020, Nature Machine Intelligence.

[50] Razvan Pascanu et al. Stabilizing Transformers for Reinforcement Learning, 2019, ICML.

[51] Oleg O. Sushkov et al. Scaling data-driven robotics with reward sketching and batch reinforcement learning, 2019, Robotics: Science and Systems.

[52] Christopher Potts et al. Relational reasoning and generalization using non-symbolic neural networks, 2020, CogSci.

[53] Kristian Kersting et al. Making deep neural networks right for the right scientific reasons by interacting with their explanations, 2020, Nature Machine Intelligence.

[54] Gary Marcus et al. The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence, 2020, arXiv.

[55] Francisco S. Melo et al. Learning from Explanations and Demonstrations: A Pilot Study, 2020, NL4XAI.

[56] James L. McClelland et al. Environmental drivers of systematicity and generalization in a situated agent, 2019, ICLR.

[57] Luis C. Lamb et al. Neurosymbolic AI: the 3rd wave, 2020, Artificial Intelligence Review.

[58] Alan Yuille et al. Visual analogy: Deep learning versus compositional models, 2021, arXiv.

[59] K. Kersting et al. Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations, 2021, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[60] Mudit Verma et al. Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation, 2020, NeurIPS.

[61] J. Bowers et al. Can Deep Convolutional Neural Networks Learn Same-Different Relations?, 2021, bioRxiv.

[62] A. Wright et al. Issues in the comparative cognition of same/different abstract-concept learning, 2021, Current Opinion in Behavioral Sciences.

[63] Mohit Bansal et al. When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data, 2021, LNLS.

[64] K. Holyoak et al. Emergence of relational reasoning, 2021, Current Opinion in Behavioral Sciences.

[65] Zeb Kurth-Nelson et al. Alchemy: A structured task distribution for meta-reinforcement learning, 2021, arXiv.

[66] S. Gershman et al. Memory as a Computational Resource, 2021, Trends in Cognitive Sciences.

[67] Andrew Kyle Lampinen et al. Symbolic Behaviour in Artificial Intelligence, 2021, arXiv.

[68] James L. McClelland et al. What underlies rapid learning and systematic generalization in humans, 2021, arXiv.

[69] Francesca Toni et al. Explanation-Based Human Debugging of NLP Models: A Survey, 2021, Transactions of the Association for Computational Linguistics.

[70] Oriol Vinyals et al. Highly accurate protein structure prediction with AlphaFold, 2021, Nature.

[71] Marcel van Gerven et al. Explainable Deep Learning: A Field Guide for the Uninitiated, 2020, Journal of Artificial Intelligence Research.