Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning

Learning transferable knowledge across similar but different settings is a fundamental component of generalized intelligence. In this paper, we approach the transfer learning challenge from a causal theory perspective. Our agent is endowed with two basic yet general theories for transfer learning: (i) a task shares a common abstract structure that is invariant across domains, and (ii) the behavior of specific features of the environment remains constant across domains. We adopt a Bayesian perspective of causal theory induction and use these theories to transfer knowledge between environments. Given these general theories, the goal is to train an agent by interactively exploring the problem space to (i) discover, form, and transfer useful abstract and structural knowledge, and (ii) induce useful knowledge from the instance-level attributes observed in the environment. A hierarchy of Bayesian structures is used to model abstract-level structural causal knowledge, and an instance-level associative learning scheme learns which specific objects can be used to induce state changes through interaction. This model-learning scheme is then integrated with a model-based planner to achieve a task in the OpenLock environment, a virtual ``escape room'' with a complex hierarchy that requires agents to reason about an abstract, generalized causal structure. We compare performance against a set of predominant model-free reinforcement learning (RL) algorithms. RL agents showed a poor ability to transfer learned knowledge across different trials. In contrast, the proposed model exhibited performance trends similar to those of human learners and, more importantly, demonstrated transfer behavior across trials and learning situations.
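The two levels of learning described above can be illustrated with a minimal sketch. The hypothesis sets, likelihood values, and the Rescorla-Wagner-style update below are illustrative assumptions, not the paper's actual model: a Bayesian posterior is maintained over candidate abstract causal structures (shared across trials), while a simple associative update tracks which instance-level attributes (e.g., lever color) induce state changes.

```python
def structure_posterior(hypotheses, observations, likelihood):
    """Bayesian update over candidate abstract causal structures.

    hypotheses:   list of candidate structures
    observations: list of (action_sequence, success) pairs
    likelihood:   P(outcome | structure, action_sequence)
    """
    post = {h: 1.0 / len(hypotheses) for h in hypotheses}
    for obs in observations:
        post = {h: p * likelihood(h, obs) for h, p in post.items()}
        z = sum(post.values())
        post = {h: p / z for h, p in post.items()}
    return post


# Hypothetical 3-lever setup: each abstract structure is the set of
# lever orderings it permits (positions/colors vary across trials,
# but the structure itself is trial-invariant).
H_chain = frozenset([("A", "B", "C")])                    # strict chain A -> B -> C
H_cc = frozenset([("A", "B", "C"), ("B", "A", "C")])      # common cause: A, B in any order, then C


def likelihood(h, obs):
    """Noisy-deterministic likelihood: structure predicts success
    iff the attempted sequence is one it permits (0.9/0.1 assumed)."""
    seq, success = obs
    return 0.9 if (seq in h) == success else 0.1


def rw_update(strengths, attrs, outcome, alpha=0.3):
    """Rescorla-Wagner-style associative update for instance-level
    attributes observed during an interaction."""
    total = sum(strengths.get(a, 0.0) for a in attrs)
    for a in attrs:
        strengths[a] = strengths.get(a, 0.0) + alpha * (outcome - total)
    return strengths


# Observing that pushing B, then A, then C opens the lock shifts the
# posterior toward the common-cause structure.
post = structure_posterior([H_chain, H_cc], [(("B", "A", "C"), True)], likelihood)

# Instance level: a grey lever moved, so its associative strength grows.
strengths = rw_update({}, ["grey"], outcome=1.0)
```

A planner can then pick the action sequence that is most likely to succeed under the current posterior, weighting candidate objects by their learned associative strengths.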
