Causal Induction from Visual Observations for Goal Directed Tasks

Causal reasoning is an indispensable capability that lets humans and other intelligent animals interact with the physical world. In this work, we propose to endow an artificial agent with the capability of causal reasoning for completing goal-directed tasks. We develop learning-based approaches for inducing causal knowledge in the form of directed acyclic graphs, which can be used to contextualize a learned goal-conditional policy so that it can perform tasks in novel environments with latent causal structures. We leverage attention mechanisms in both our causal induction model and our goal-conditional policy, which lets us incrementally generate the causal graph from the agent's visual observations and selectively use the induced graph to determine actions. Our experiments show that our method generalizes effectively to new tasks in novel environments with previously unseen causal structures.
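To make the two uses of attention more concrete, the following is a minimal NumPy sketch, not the model described in this work. It assumes a graph over N variables stored as an N x N matrix whose entry (i, j) scores the edge "cause i affects effect j". The function names `update_graph` and `select_causes`, the hand-written logits, and the update rule are all hypothetical illustrations; in a learned system the logits would come from networks that encode visual observations and the goal. The sketch also does not enforce acyclicity.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def update_graph(graph, edge_logits, delta):
    """Incremental induction step (hypothetical rule).

    Attention over all candidate edges decides where the scalar
    update `delta` is written into the running graph estimate.
    """
    n = graph.shape[0]
    attention = softmax(edge_logits.reshape(-1)).reshape(n, n)
    return np.clip(graph + attention * delta, 0.0, 1.0)


def select_causes(graph, goal_logits):
    """Policy-side selection (hypothetical rule).

    Attention over effect nodes picks the columns of the graph that
    matter for the goal and returns a relevance score per cause.
    """
    weights = softmax(goal_logits)
    return graph @ weights


# Assumed toy setting: 3 variables, empty initial graph.
graph = np.zeros((3, 3))

# One observation suggests that variable 0 affects variable 2.
edge_logits = np.zeros((3, 3))
edge_logits[0, 2] = 10.0
graph = update_graph(graph, edge_logits, delta=1.0)

# The goal concerns variable 2, so attention focuses on that effect.
goal_logits = np.array([0.0, 0.0, 10.0])
relevance = select_causes(graph, goal_logits)
best_cause = int(np.argmax(relevance))
```

In this toy run the update concentrates almost entirely on edge (0, 2), and the goal-side attention then ranks variable 0 as the most relevant cause to act on.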
