A DIFFERENTIABLE PHYSICS ENGINE FOR DEEP LEARNING IN ROBOTICS

Optimizing controllers is one of the central problems in robotics. Robots are currently often treated as a black box during this optimization, which is why derivative-free methods such as evolutionary algorithms and reinforcement learning are omnipresent. When gradient-based methods are used, either the models are kept small or the Jacobian is approximated with finite differences. Finite differences quickly become expensive as the number of parameters grows, as is the case in deep learning. We propose an implementation of a modern physics engine that can differentiate with respect to control parameters. The engine is implemented for both CPU and GPU. First, this paper shows how such an engine speeds up the optimization process, even for small problems. Furthermore, it explains why this approach is an alternative to deep Q-learning for applying deep learning in robotics. Finally, we argue that this is a big step for deep learning in robotics, as it opens up new possibilities to optimize robots, in both hardware and software.
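To make the contrast between finite differences and a differentiable simulator concrete, the sketch below rolls out a toy 1-D point mass under a PD controller and computes the gradient of the final position error with respect to the two controller gains in two ways. It is only an illustration under stated assumptions, not the engine described in this paper: the system, the integrator, the loss, and all names (`simulate`, `grad_backprop`, `grad_finite_diff`) are hypothetical, and the backward pass is written by hand where a real engine would presumably rely on automatic differentiation.

```python
def simulate(params, steps=100, dt=0.01, target=1.0):
    """Roll out a unit point mass under a PD controller (toy, assumed system).

    Returns the final loss, the list of states visited before each step,
    and the final position.
    """
    kp, kd = params
    x, v = 0.0, 0.0
    traj = []
    for _ in range(steps):
        traj.append((x, v))
        u = kp * (target - x) - kd * v  # control force
        v = v + dt * u                  # semi-implicit Euler, mass = 1
        x = x + dt * v
    loss = (x - target) ** 2
    return loss, traj, x


def grad_backprop(params, steps=100, dt=0.01, target=1.0):
    """Gradient of the loss w.r.t. (kp, kd) by reverse-mode differentiation.

    One forward rollout plus one backward sweep, independent of the
    number of parameters.
    """
    kp, kd = params
    loss, traj, x_final = simulate(params, steps, dt, target)
    gx = 2.0 * (x_final - target)  # dL/dx at the final state
    gv = 0.0                       # dL/dv at the final state
    g_kp = 0.0
    g_kd = 0.0
    for x, v in reversed(traj):
        # Forward step was: u = kp*(target - x) - kd*v
        #                   v_next = v + dt*u
        #                   x_next = x + dt*v_next
        gv_next = gv + dt * gx     # v_next also feeds x_next
        gu = dt * gv_next
        g_kp += gu * (target - x)
        g_kd += gu * (-v)
        gx = gx + gu * (-kp)       # x reaches the next state directly and via u
        gv = gv_next + gu * (-kd)  # v reaches the next state directly and via u
    return loss, (g_kp, g_kd)


def grad_finite_diff(params, eps=1e-6, **kwargs):
    """Central finite differences: two full rollouts per parameter."""
    grads = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        grads.append((simulate(hi, **kwargs)[0] - simulate(lo, **kwargs)[0]) / (2 * eps))
    return tuple(grads)


if __name__ == "__main__":
    gains = (5.0, 1.0)  # arbitrary example gains
    loss, analytic = grad_backprop(gains)
    numeric = grad_finite_diff(gains)
    print("loss:", loss)
    print("backprop gradient:          ", analytic)
    print("finite-difference gradient: ", numeric)
```

With two parameters the cost difference is negligible, but the finite-difference estimate needs two rollouts per parameter, whereas the backward sweep costs roughly one extra rollout regardless of how many parameters the controller has. That scaling is what matters once the controller is a neural network with thousands of weights.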
