Learning dexterous in-hand manipulation

We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies that can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we randomize many of the physical properties of the system such as friction coefficients and an object’s appearance. Our policies transfer to the physical robot despite being trained entirely in simulation. Our method does not rely on any human demonstrations, but many behaviors found in human manipulation emerge naturally, including finger gaiting, multi-finger coordination, and the controlled use of gravity. Our results were obtained using the same distributed RL system that was used to train OpenAI Five. We also include a video of our results: https://youtu.be/jwSbzNHGflM.
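
As a rough illustration of the randomization described above, the following minimal sketch samples a new set of simulator parameters at the start of each training episode. It is not the authors' implementation: the `EpisodeParams` container, the `sample_episode_params` helper, and the numeric range are all hypothetical and chosen only for illustration, and no real simulator is invoked.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Simulator settings sampled once per episode (hypothetical fields)."""
    friction_scale: float  # multiplier applied to the nominal friction coefficient
    object_rgb: tuple      # object color, each channel in [0, 1)


def sample_episode_params(rng: random.Random) -> EpisodeParams:
    """Draw one random set of physical and visual parameters."""
    # The range below is an illustrative assumption, not a value from the paper.
    friction_scale = rng.uniform(0.7, 1.3)
    object_rgb = tuple(rng.random() for _ in range(3))
    return EpisodeParams(friction_scale, object_rgb)


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        params = sample_episode_params(rng)
        # A real training loop would apply `params` to the simulator here
        # and then collect a rollout with the current policy.
        print(episode, params)
```

Because the policy sees a different draw in every episode, it cannot rely on any single value of friction or appearance, which is the intuition behind training in simulation and then deploying on the physical hand.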
