End-to-End Differentiable Physics for Learning and Control

We present a differentiable physics engine that can be integrated as a module in deep neural networks for end-to-end learning. As a result, structured physics knowledge can be embedded into larger systems, allowing them, for example, to match observations by performing precise simulations, while achieves high sample efficiency. Specifically, in this paper we demonstrate how to perform backpropagation analytically through a physical simulator defined via a linear complementarity problem. Unlike traditional finite difference methods, such gradients can be computed analytically, which allows for greater flexibility of the engine. Through experiments in diverse domains, we highlight the system's ability to learn physical parameters from data, efficiently match and simulate observed visual behavior, and readily enable control via gradient-based planning methods. Code for the engine and experiments is included with the paper.

[1]  Roy Featherstone,et al.  Robot Dynamics Algorithms , 1987 .

[2]  Lennart Ljung,et al.  System Identification: Theory for the User , 1987 .

[3]  Paul J. Werbos,et al.  Neural networks for control and system identification , 1989, Proceedings of the 28th IEEE Conference on Decision and Control,.

[4]  Christopher G. Atkeson,et al.  A comparison of direct and model-based reinforcement learning , 1997, Proceedings of International Conference on Robotics and Automation.

[5]  M. Anitescu,et al.  Formulating Dynamic Multi-Rigid-Body Contact Problems with Friction as Solvable Linear Complementarity Problems , 1997 .

[6]  Michael Bradley Cline Rigid Body Simulation with Contact and Constraints , 2002 .

[7]  Gino van den Bergen Collision Detection in Interactive 3D Environments , 2003 .

[8]  Pieter Abbeel,et al.  Learning first-order Markov models for control , 2004, NIPS.

[9]  Emanuel Todorov,et al.  Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems , 2004, ICINCO.

[10]  S. Sra,et al.  Matrix Differential Calculus , 2005 .

[11]  Richard W. Cottle,et al.  Linear Complementarity Problem , 2009, Encyclopedia of Optimization.

[12]  Stephen P. Boyd,et al.  CVXGEN: a code generator for embedded convex optimization , 2011, Optimization and Engineering.

[13]  Kevin A. Smith,et al.  Sources of uncertainty in intuitive physics , 2012, CogSci.

[14]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[15]  Jessica B. Hamrick,et al.  Simulation as an engine of physical scene understanding , 2013, Proceedings of the National Academy of Sciences.

[16]  Benjamin Schrauwen,et al.  Automated Design of Complex Dynamic Systems , 2014, PloS one.

[17]  Thomas L. Griffiths,et al.  Think again? The amount of mental simulation tracks uncertainty in the outcome , 2015, CogSci.

[18]  Yuval Tassa,et al.  Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.

[19]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[20]  Jitendra Malik,et al.  Learning Visual Predictive Models of Physics for Playing Billiards , 2015, ICLR.

[21]  David Silver,et al.  Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.

[22]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[23]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[24]  Rob Fergus,et al.  Learning Physical Intuition of Block Towers by Example , 2016, ICML.

[25]  J. Zico Kolter,et al.  OptNet: Differentiable Optimization as a Layer in Neural Networks , 2017, ICML.

[26]  Andreas Krause,et al.  Differentiable Learning of Submodular Models , 2017, NIPS 2017.

[27]  Joshua B. Tenenbaum,et al.  A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.

[28]  Jonas Degrave,et al.  A DIFFERENTIABLE PHYSICS ENGINE FOR DEEP LEARNING IN ROBOTICS , 2016, Front. Neurorobot..

[29]  Michael J. Black,et al.  Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[31]  Pieter Abbeel,et al.  Prediction and Control with Temporal Segment Models , 2017, ICML.

[32]  Pieter Abbeel,et al.  Model-Ensemble Trust-Region Policy Optimization , 2018, ICLR.

[33]  J. Zico Kolter,et al.  What game are we playing? End-to-end learning in normal and extensive form games , 2018, IJCAI.

[34]  Sergey Levine,et al.  Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[35]  Siddhartha S. Srinivasa,et al.  DART: Dynamic Animation and Robotics Toolkit , 2018, J. Open Source Softw..

[36]  Arthur Mensch,et al.  Differentiable Dynamic Programming for Structured Prediction and Attention , 2018, ICML.