CoPhy: Counterfactual Learning of Physical Dynamics

Understanding causes and effects in mechanical systems is an essential component of reasoning in the physical world. This work poses a new problem: counterfactual learning of object mechanics from visual input. We develop the CoPhy benchmark to assess the capacity of state-of-the-art models for causal physical reasoning in a synthetic 3D environment, and we propose a model for learning physical dynamics in a counterfactual setting. Having observed a mechanical experiment that involves, for example, a falling tower of blocks, a set of bouncing balls, or colliding objects, we learn to predict how its outcome is affected by an arbitrary intervention on its initial conditions, such as displacing one of the objects in the scene. The alternative future is predicted from the altered past and a latent representation of the confounders, which the model learns end-to-end with no supervision. We compare against feedforward video prediction baselines and show that observing alternative experiences allows the network to capture latent physical properties of the environment, which results in significantly more accurate predictions at the level of superhuman performance.
