Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators. However, such hands pose a major challenge for autonomous control, due to the high dimensionality of their configuration space and complex intermittent contact interactions. In this work, we propose deep reinforcement learning (deep RL) as a scalable solution for learning complex, contact rich behaviors with multi-fingered hands. Deep RL provides an end-to-end approach to directly map sensor readings to actions, without the need for task specific models or policy classes. We show that contact-rich manipulation behavior with multi-fingered hands can be learned by directly training with model-free deep RL algorithms in the real world, with minimal additional assumption and without the aid of simulation. We learn to perform a variety of tasks on two different low-cost hardware platforms entirely from scratch, and further study how the learning can be accelerated by using a small number of human demonstrations. Our experiments demonstrate that complex multi-fingered manipulation skills can be learned in the real world in about 4-7 hours for most tasks, and that demonstrations can decrease this to 2-3 hours, indicating that direct deep RL training in the real world is a viable and practical alternative to simulation and model-based control. https:// sites.google.com/view/deeprl-handmanipulation

[1]  Richard M. Murray,et al.  A Mathematical Introduction to Robotic Manipulation , 1994 .

[2]  Allison M. Okamura,et al.  An overview of dexterous manipulation , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[3]  Vijay Kumar,et al.  Robotic grasping and contact: a review , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[4]  Hong Liu,et al.  DLR-Hand II: next generation of a dextrous robot hand , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).

[5]  Jun Nakanishi,et al.  Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.

[6]  Jun Wang,et al.  Synthesis of force-closure grasps on 3-D objects based on the Q distance , 2003, IEEE Trans. Robotics Autom..

[7]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[8]  Peter K. Allen,et al.  Graspit! A versatile simulator for robotic grasping , 2004, IEEE Robotics & Automation Magazine.

[9]  Stefan Schaal,et al.  Natural Actor-Critic , 2003, Neurocomputing.

[10]  Jan Peters,et al.  Noname manuscript No. (will be inserted by the editor) Policy Search for Motor Primitives in Robotics , 2022 .

[11]  Zoran Popovic,et al.  Contact-invariant optimization for hand manipulation , 2012, SCA '12.

[12]  Aaron M. Dollar,et al.  A Hand-Centric Classification of Human and Robot Dexterous Manipulation , 2013, IEEE Transactions on Haptics.

[13]  Vikash Kumar,et al.  Fast, strong and compliant pneumatic actuation for dexterous tendon-driven hands , 2013, 2013 IEEE International Conference on Robotics and Automation.

[14]  Yuval Tassa,et al.  Real-time behaviour synthesis for dynamic hand-manipulation , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[15]  C. Karen Liu,et al.  Dexterous manipulation using both palm and fingers , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[16]  Sergey Levine,et al.  Trust Region Policy Optimization , 2015, ICML.

[17]  Jan Peters,et al.  Learning robot in-hand manipulation with tactile features , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).

[18]  Sergey Levine,et al.  Learning dexterous manipulation for a soft robotic hand from human demonstrations , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[19]  Sergey Levine,et al.  Optimal control with learned local models: Application to dexterous manipulation , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[20]  Martin A. Riedmiller,et al.  Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.

[21]  Andrew J. Davison,et al.  Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task , 2017, CoRL.

[22]  Balaraman Ravindran,et al.  EPOpt: Learning Robust Neural Network Policies Using Model Ensembles , 2016, ICLR.

[23]  Sham M. Kakade,et al.  Towards Generalization and Simplicity in Continuous Control , 2017, NIPS.

[24]  Sergey Levine,et al.  (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.

[25]  Atil Iscen,et al.  Sim-to-Real: Learning Agile Locomotion For Quadruped Robots , 2018, Robotics: Science and Systems.

[26]  Sergey Levine,et al.  Sim2Real View Invariant Visual Servoing by Recurrent Control , 2017, ArXiv.

[27]  Marcin Andrychowicz,et al.  Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Nando de Freitas,et al.  Reinforcement and Imitation Learning for Diverse Visuomotor Skills , 2018, Robotics: Science and Systems.

[29]  Sergey Levine,et al.  Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.

[30]  Tom Schaul,et al.  Deep Q-learning From Demonstrations , 2017, AAAI.

[31]  Jakub W. Pachocki,et al.  Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..