Model Identification via Physics Engines for Improved Policy Search

This paper presents a practical approach for identifying unknown mechanical parameters, such as the mass and friction models of manipulated rigid objects or actuated robotic links, in a succinct manner that aims to improve the performance of policy search algorithms. The key features of this approach are the use of off-the-shelf physics engines and the adaptation of a black-box Bayesian optimization framework to this purpose. The physics engine is used to reproduce in simulation the experiments performed on a real robot, and the mechanical parameters of the simulated system are automatically fine-tuned so that the simulated trajectories match the real ones. The optimized model is then used to learn a policy in simulation before safely deploying it on the real robot. Given the well-known limitations of physics engines in modeling real-world objects, it is generally not possible to find a mechanical model that exactly reproduces the real trajectories in simulation. Moreover, in many scenarios a near-optimal policy can be found without perfect knowledge of the system, so searching for a perfect model may not be worth the computational effort in practice. The proposed approach therefore aims to identify a model that is good enough to approximate the value of a locally optimal policy with a certain confidence, instead of spending all the computational resources on searching for the most accurate model. Empirical evaluations, performed in simulation and on a real robotic manipulation task, show that model identification via physics engines can significantly boost the performance of policy search algorithms that are popular in robotics, such as TRPO, PoWER and PILCO, with no additional real-world data.
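
The identification loop described above can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several simplifying assumptions: a toy one-dimensional sliding-block simulator stands in for an off-the-shelf physics engine, plain random search stands in for the Bayesian optimization framework, and a fixed trajectory-error tolerance stands in for the confidence-based stopping criterion on the policy value. All names (`simulate`, `trajectory_error`, `identify`) and all numeric settings are hypothetical.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2


def simulate(mu, v0=2.0, dt=0.01, steps=100):
    """Toy stand-in for a physics engine: a block sliding under Coulomb friction.

    Returns the block position after each time step.
    """
    x, v, traj = 0.0, v0, []
    for _ in range(steps):
        v = max(0.0, v - mu * G * dt)  # friction slows the block until it stops
        x += v * dt
        traj.append(x)
    return traj


def trajectory_error(sim, real):
    """Mean squared distance between simulated and observed positions."""
    return sum((s - r) ** 2 for s, r in zip(sim, real)) / len(real)


def identify(real, lo=0.05, hi=1.0, budget=500, tol=1e-4, seed=0):
    """Black-box search for the friction coefficient mu.

    Assumption: random search replaces Bayesian optimization, and the search
    stops as soon as the model is 'good enough' (error below tol) rather than
    spending the whole budget on the most accurate model.
    """
    rng = random.Random(seed)
    best_mu, best_err = None, float("inf")
    for _ in range(budget):
        mu = rng.uniform(lo, hi)
        err = trajectory_error(simulate(mu), real)
        if err < best_err:
            best_mu, best_err = mu, err
        if best_err < tol:
            break  # good enough: stop searching
    return best_mu, best_err


# Pretend this trajectory was recorded on the real robot (true mu = 0.3).
real_traj = simulate(0.3)
mu_hat, err = identify(real_traj)
print(f"estimated mu = {mu_hat:.3f}, trajectory error = {err:.2e}")
```

In this sketch the identified `mu_hat` would then parameterize the simulator in which a policy is learned. The early exit on `tol` only mirrors the idea of not over-investing in model accuracy; the paper's actual criterion is tied to the value of the policy, not to raw trajectory error.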
