Joint Space Control via Deep Reinforcement Learning

The dominant approach to controlling a robot manipulator relies on hand-crafted differential equations built on some form of inverse kinematics or dynamics. We propose a simple, versatile joint-level controller that dispenses with differential equations entirely: a deep neural network, trained via model-free reinforcement learning, maps from task space to joint space. Experiments show that the method achieves error comparable to traditional methods while greatly simplifying the control pipeline by automatically handling redundancy, joint limits, and acceleration/deceleration profiles. The basic technique is extended to obstacle avoidance by augmenting the network's input with information about the nearest obstacles. Results are shown both in simulation and on a real robot via sim-to-real transfer of the learned policy. With a moderate amount of training, the method achieves sub-centimeter accuracy both in simulation and in the real world.
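To make the interface concrete, the mapping described above can be sketched as a small feed-forward policy that takes a task-space target plus the current joint state and emits bounded joint-space deltas. This is an illustrative sketch only: the layer sizes, the tanh squashing, and the `max_delta` bound are assumptions for the example, not the paper's exact architecture or training setup.

```python
import numpy as np

def init_policy(obs_dim, act_dim, hidden=64, seed=0):
    """Randomly initialize a tiny two-layer MLP policy (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (obs_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, act_dim)),
        "b2": np.zeros(act_dim),
    }

def policy_action(params, target_xyz, joint_angles, max_delta=0.05):
    """Map a task-space target + joint state to bounded joint deltas (rad).

    The tanh output plus the max_delta scale act as a built-in
    per-step joint-velocity limit, one way the learned controller can
    respect actuator constraints without an explicit dynamics model.
    """
    obs = np.concatenate([target_xyz, joint_angles])
    h = np.tanh(obs @ params["W1"] + params["b1"])
    return max_delta * np.tanh(h @ params["W2"] + params["b2"])

# Example: a hypothetical 7-DoF arm, observation = 3-D target + 7 joint angles.
params = init_policy(obs_dim=3 + 7, act_dim=7)
delta_q = policy_action(params, np.zeros(3), np.zeros(7))
print(delta_q.shape)  # (7,)
```

In the obstacle-avoidance extension, the observation vector would simply be widened with features describing the nearest obstacles; the policy architecture itself is unchanged.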
