Sim2Real Transfer for Reinforcement Learning without Dynamics Randomization

In this work we show how to use the Operational Space Control (OSC) framework under joint and Cartesian constraints for reinforcement learning in Cartesian space. Our method learns quickly and with adjustable degrees of freedom, and the resulting policies transfer to a KUKA LBR iiwa peg-in-hole task without additional dynamics randomization. Before learning in simulation starts, we perform a system identification to align the simulation environment as closely as possible with the dynamics of the real robot. Adding constraints to the OSC controller allows us either to learn safely on the real robot or to learn a flexible, goal-conditioned policy that can be easily transferred from simulation to the real robot.
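
The system identification step can be illustrated with a toy example. The sketch below is not the authors' procedure: it assumes a hypothetical 1-DoF spring-damper joint as the "simulator", uses a simulated rollout as a stand-in for a recorded robot trajectory, and replaces whatever optimizer the work actually uses with a plain grid search. All names (simulate, trajectory_error, identify_damping) and all numeric values are illustrative assumptions.

```python
def simulate(damping, steps=200, dt=0.01, q0=1.0, stiffness=20.0, inertia=1.0):
    """Roll out a hypothetical 1-DoF spring-damper joint and return its positions.

    Uses semi-implicit Euler integration; every parameter value is illustrative.
    """
    q, qd = q0, 0.0
    trajectory = []
    for _ in range(steps):
        qdd = (-stiffness * q - damping * qd) / inertia  # joint acceleration
        qd += qdd * dt
        q += qd * dt
        trajectory.append(q)
    return trajectory


def trajectory_error(damping, reference):
    """Sum of squared position errors between a simulated and a reference rollout."""
    simulated = simulate(damping, steps=len(reference))
    return sum((s - r) ** 2 for s, r in zip(simulated, reference))


def identify_damping(reference, candidates):
    """Return the candidate damping whose rollout best matches the reference.

    Grid search is an assumption made for brevity, not the method of the paper.
    """
    return min(candidates, key=lambda d: trajectory_error(d, reference))


# Stand-in for a trajectory recorded on the real robot (assumed, not real data).
recorded = simulate(damping=1.5)

# Candidate damping values 0.0, 0.1, ..., 5.0.
candidates = [i / 10 for i in range(51)]
estimated = identify_damping(recorded, candidates)
print("estimated damping:", estimated)
```

In a real setting the recorded trajectory would come from the robot, several parameters would be fitted jointly, and a gradient-free optimizer would typically replace the grid search.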
