Causal Discovery in Physical Systems from Videos

Causal discovery is at the core of human cognition. It enables us to reason about the environment and make counterfactual predictions about unseen scenarios that can differ vastly from our previous experiences. We consider the task of causal discovery from videos in an end-to-end fashion, without supervision on the ground-truth graph structure. In particular, our goal is to discover the structural dependencies among environmental and object variables, inferring the type and strength of the interactions that have a causal effect on the behavior of the dynamical system. Our model consists of (a) a perception module that extracts a semantically meaningful and temporally consistent keypoint representation from images, (b) an inference module that determines the graph distribution induced by the detected keypoints, and (c) a dynamics module that predicts the future by conditioning on the inferred graph. We assume access to different configurations and environmental conditions, i.e., data from unknown interventions on the underlying system; thus, we can hope to discover the correct underlying causal graph without explicit interventions. We evaluate our method in a planar multi-body interaction environment and in scenarios involving fabrics of different shapes, such as shirts and pants. Experiments demonstrate that our model can correctly identify the interactions from a short sequence of images and make long-term future predictions. The causal structure assumed by the model also allows it to make counterfactual predictions and to extrapolate to systems with unseen interaction graphs or graphs of different sizes.
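To make the idea of predicting the future "by conditioning on the inferred graph" concrete, the following is a minimal sketch, not the model described above. It assumes a hand-written planar spring simulator in place of the learned dynamics module, unit masses, and a graph given as a dictionary from keypoint pairs to a stiffness value. The names `step`, `rollout`, `rest_len`, and `dt` are hypothetical and chosen only for illustration. The point is that the same initial keypoints rolled out under an edited graph yield a counterfactual trajectory.

```python
from typing import Dict, List, Tuple

Vec = Tuple[float, float]
# Hypothetical graph encoding: undirected edge (i, j) -> spring stiffness.
Graph = Dict[Tuple[int, int], float]


def step(pos: List[Vec], vel: List[Vec], graph: Graph,
         rest_len: float = 1.0, dt: float = 0.01) -> Tuple[List[Vec], List[Vec]]:
    """Advance unit-mass keypoints one step; only edges in `graph` exert force."""
    forces = [[0.0, 0.0] for _ in pos]
    for (i, j), stiffness in graph.items():
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        dist = (dx * dx + dy * dy) ** 0.5
        if dist == 0.0:
            continue  # direction undefined for coincident points
        mag = stiffness * (dist - rest_len)  # Hooke's law along the edge
        fx, fy = mag * dx / dist, mag * dy / dist
        forces[i][0] += fx
        forces[i][1] += fy
        forces[j][0] -= fx
        forces[j][1] -= fy
    # Semi-implicit Euler: update velocities first, then positions.
    new_vel = [(v[0] + dt * f[0], v[1] + dt * f[1]) for v, f in zip(vel, forces)]
    new_pos = [(p[0] + dt * v[0], p[1] + dt * v[1]) for p, v in zip(pos, new_vel)]
    return new_pos, new_vel


def rollout(pos: List[Vec], vel: List[Vec], graph: Graph, steps: int) -> List[List[Vec]]:
    """Return steps + 1 frames of keypoint positions, starting with the initial frame."""
    frames = [list(pos)]
    for _ in range(steps):
        pos, vel = step(pos, vel, graph)
        frames.append(list(pos))
    return frames


# Three keypoints at rest; in the "factual" graph only 0 and 1 interact.
init_pos = [(0.0, 0.0), (1.5, 0.0), (0.0, 3.0)]
init_vel = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
factual_graph = {(0, 1): 5.0}
# Counterfactual: same initial state, but an extra edge between 1 and 2.
counterfactual_graph = {(0, 1): 5.0, (1, 2): 5.0}

factual = rollout(init_pos, init_vel, factual_graph, steps=50)
counterfactual = rollout(init_pos, init_vel, counterfactual_graph, steps=50)
```

In this toy setting, keypoint 2 stays put under the factual graph because no edge touches it, and it starts moving once the counterfactual edge is added. In the approach described above, the graph would instead be inferred from detected keypoints and the dynamics would be learned rather than hand-coded.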
