Graph networks as learnable physics engines for inference and control

Understanding and interacting with everyday physical scenes requires rich knowledge about the structure of the world, represented either implicitly in a value or policy function, or explicitly in a transition model. Here we introduce a new class of learnable models--based on graph networks--which implement an inductive bias for object- and relation-centric representations of complex, dynamical systems. Our results show that as a forward model, our approach supports accurate predictions from real and simulated data, and surprisingly strong and efficient generalization, across eight distinct physical systems which we varied parametrically and structurally. We also found that our inference model can perform system identification. Our models are also differentiable, and support online planning via gradient-based trajectory optimization, as well as offline policy optimization. Our framework offers new opportunities for harnessing and exploiting rich knowledge about the world, and takes a key step toward building machines with more human-like representations of the world.

[1]  K. J. Craik,et al.  The nature of explanation , 1944 .

[2]  Philip N. Johnson-Laird,et al.  Mental Models in Cognitive Science , 1980, Cogn. Sci..

[3]  Jürgen Schmidhuber,et al.  Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[4]  Daniel M. Wolpert,et al.  Forward Models for Physiological Motor Control , 1996, Neural Networks.

[5]  Christopher G. Atkeson,et al.  A comparison of direct and model-based reinforcement learning , 1997, Proceedings of International Conference on Robotics and Automation.

[6]  Geoffrey E. Hinton,et al.  NeuroAnimator: fast neural network emulation and control of physics-based models , 1998, SIGGRAPH.

[7]  Emanuel Todorov,et al.  Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems , 2004, ICINCO.

[8]  Ah Chung Tsoi,et al.  Graph neural networks for ranking Web pages , 2005, The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05).

[9]  Katherine D. Kinzler,et al.  Core knowledge. , 2007, Developmental science.

[10]  William D. Smart,et al.  Receding Horizon Differential Dynamic Programming , 2007, NIPS.

[11]  Ah Chung Tsoi,et al.  Computational Capabilities of Graph Neural Networks , 2009, IEEE Transactions on Neural Networks.

[12]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[13]  Yi Sun,et al.  Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments , 2011, AGI.

[14]  Carl E. Rasmussen,et al.  PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.

[15]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[16]  Jessica B. Hamrick,et al.  Simulation as an engine of physical scene understanding , 2013, Proceedings of the National Academy of Sciences.

[17]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[18]  Yuval Tassa,et al.  Control-limited differential dynamic programming , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[19]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[20]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[21]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[22]  Sergey Levine,et al.  Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics , 2014, NIPS.

[23]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[24]  Yuval Tassa,et al.  Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.

[25]  Joan Bruna,et al.  Deep Convolutional Networks on Graph-Structured Data , 2015, ArXiv.

[26]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[27]  Le Song,et al.  Discriminative Embeddings of Latent Variable Models for Structured Data , 2016, ICML.

[28]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[29]  Richard S. Zemel,et al.  Gated Graph Sequence Neural Networks , 2015, ICLR.

[30]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[31]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[32]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[33]  Sergey Levine,et al.  Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.

[34]  Filip De Turck,et al.  Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks , 2016, ArXiv.

[35]  Razvan Pascanu,et al.  A simple neural network module for relational reasoning , 2017, NIPS.

[36]  Razvan Pascanu,et al.  Metacontrol for Adaptive Imagination-Based Optimization , 2017, ICLR.

[37]  Pierre Vandergheynst,et al.  Geometric Deep Learning: Going beyond Euclidean data , 2016, IEEE Signal Process. Mag..

[38]  Razvan Pascanu,et al.  Discovering objects and their relations from entangled scene representations , 2017, ICLR.

[39]  Razvan Pascanu,et al.  Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.

[40]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[41]  Greg Turk,et al.  Preparing for the Unknown: Learning a Universal Policy with Online System Identification , 2017, Robotics: Science and Systems.

[42]  Balaraman Ravindran,et al.  EPOpt: Learning Robust Neural Network Policies Using Model Ensembles , 2016, ICLR.

[43]  Niloy J. Mitra,et al.  Learning A Physical Long-term Predictor , 2017, ArXiv.

[44]  Joshua B. Tenenbaum,et al.  A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.

[45]  Razvan Pascanu,et al.  Learning model-based planning from scratch , 2017, ArXiv.

[46]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[47]  Marcin Andrychowicz,et al.  Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[48]  Sanja Fidler,et al.  NerveNet: Learning Structured Policy with Graph Neural Networks , 2018, ICLR.

[49]  Misha Denil,et al.  Learning Awareness Models , 2018, ICLR.

[50]  Yuval Tassa,et al.  DeepMind Control Suite , 2018, ArXiv.

[51]  Sergey Levine,et al.  Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).