Learning dexterous manipulation for a soft robotic hand from human demonstrations

Dexterous multi-fingered hands can accomplish fine manipulation behaviors that are infeasible with simple robotic grippers. However, sophisticated multi-fingered hands are often expensive and fragile. Low-cost soft hands offer an appealing alternative to more conventional devices, but present considerable challenges in sensing and actuation, making them difficult to apply to more complex manipulation tasks. In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks. Our method uses object-centric demonstrations, where a human demonstrates the desired motion of manipulated objects with their own hands, and the robot autonomously learns to imitate these demonstrations using reinforcement learning. We propose a novel algorithm that allows us to blend and select a subset of the most feasible demonstrations, which we use with an extension of the guided policy search framework that learns generalizable neural network policies. We demonstrate our approach on the RBO Hand 2, with learned motor skills for turning a valve, manipulating an abacus, and grasping.
