Generative Attention Learning: a “GenerAL” framework for high-performance multi-fingered grasping in clutter

Generative Attention Learning (GenerAL) is a framework for high-DOF multi-fingered grasping that is not only robust to dense clutter and novel objects but also effective with a variety of different parallel-jaw and multi-fingered robot hands. This framework introduces a novel attention mechanism that substantially improves the grasp success rate in clutter. Its generative nature allows the learning of full-DOF grasps with flexible end-effector positions and orientations, as well as all finger joint angles of the hand. Trained purely in simulation, this framework skillfully closes the sim-to-real gap. To close the visual sim-to-real gap, this framework uses a single depth image as input. To close the dynamics sim-to-real gap, this framework circumvents continuous motor control with a direct mapping from pixel to Cartesian space inferred from the same depth image. Finally, this framework demonstrates inter-robot generality by achieving over $$92\%$$ real-world grasp success rates in cluttered scenes with novel objects using two multi-fingered robotic hand-arm systems with different degrees of freedom.

[1]  Wojciech Zaremba,et al.  Domain Randomization and Generative Models for Robotic Grasping , 2017, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[2]  Josep M. Porta,et al.  Global Optimization of Robotic Grasps , 2011, Robotics: Science and Systems.

[3]  Anis Sahbani,et al.  Dexterous manipulation planning using probabilistic roadmaps in continuous grasp subspaces , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[4]  Sergey Levine,et al.  Learning Hand-Eye Coordination for Robotic Grasping with Large-Scale Data Collection , 2016, ISER.

[5]  Peter K. Allen,et al.  Workspace Aware Online Grasp Planning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[6]  Peter Corke,et al.  Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach , 2018, Robotics: Science and Systems.

[7]  Sergey Levine,et al.  QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.

[8]  Peter K. Allen,et al.  Generating multi-fingered robotic grasps via deep learning , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[9]  Sergey Levine,et al.  Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.

[10]  Sergey Levine,et al.  High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.

[11]  Rui Chen,et al.  GRIP: Generative Robust Inference and Perception for Semantic Robot Manipulation in Adversarial Environments , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[12]  Peter K. Allen,et al.  Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[13]  Gregory Palmer,et al.  Fully Convolutional One-Shot Object Segmentation for Industrial Robotics , 2019, AAMAS.

[14]  Mirko Wächter,et al.  Grasping of Unknown Objects Using Deep Convolutional Neural Networks Based on Depth Images , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[15]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Honglak Lee,et al.  Deep learning for detecting robotic grasps , 2013, Int. J. Robotics Res..

[17]  Kate Saenko,et al.  Learning a visuomotor controller for real world robotic grasping using simulated depth images , 2017, CoRL.

[18]  Tucker Hermans,et al.  Planning Multi-Fingered Grasps as Probabilistic Inference in a Learned Deep Network , 2018, ISRR.

[19]  Xinyu Liu,et al.  Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics , 2017, Robotics: Science and Systems.

[20]  Danica Kragic,et al.  Hierarchical Fingertip Space for multi-fingered precision grasping , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[21]  Allison M. Okamura,et al.  An overview of dexterous manipulation , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).

[22]  Oliver Kroemer,et al.  Towards Precise Robotic Grasping by Probabilistic Post-grasp Displacement Estimation , 2019, FSR.

[23]  Matei T. Ciocarlie,et al.  Hand Posture Subspaces for Dexterous Robotic Grasping , 2009, Int. J. Robotics Res..

[24]  Sergey Levine,et al.  Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Sergey Levine,et al.  Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[26]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[27]  Sergey Levine,et al.  Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[29]  David Watkins-Valls,et al.  Multi-Modal Geometric Learning for Grasping and Manipulation , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[30]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[31]  Siddhartha S. Srinivasa,et al.  Grasp synthesis in cluttered environments for dexterous hands , 2008, Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots.

[32]  Ville Kyrki,et al.  Robust Grasp Planning Over Uncertain Shape Completions , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[33]  P. Allen,et al.  Dexterous Grasping via Eigengrasps : A Low-dimensional Approach to a High-complexity Problem , 2007 .

[34]  Alberto Rodriguez,et al.  Learning Synergies Between Pushing and Grasping with Self-Supervised Deep Reinforcement Learning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[35]  Danica Kragic,et al.  Data-Driven Grasp Synthesis—A Survey , 2013, IEEE Transactions on Robotics.

[36]  Markus Vincze,et al.  Learning grasps for unknown objects in cluttered scenes , 2013, 2013 IEEE International Conference on Robotics and Automation.

[37]  Henrik I. Christensen,et al.  Automatic grasp planning using shape primitives , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).

[38]  Jie Zhao,et al.  Efficient Fully Convolution Neural Network for Generating Pixel Wise Robotic Grasps With High Resolution Images , 2019, 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO).

[39]  Bohan Wu,et al.  MAT: Multi-Fingered Adaptive Tactile Grasping via Deep Reinforcement Learning , 2019, CoRL.

[40]  Sergey Levine,et al.  End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..

[41]  Dmitry Berenson,et al.  Grasp planning in complex scenes , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.

[42]  Odest Chadwicke Jenkins,et al.  Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[43]  Haibin Ling,et al.  A Deep Network Solution for Attention and Aesthetics Aware Photo Cropping , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[44]  Chad DeChant,et al.  Shape completion enabled robotic grasping , 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[45]  Robert Platt,et al.  Learning 6-DoF Grasping and Pick-Place Using Attention Focus , 2018, CoRL.

[46]  Tomás Lozano-Pérez,et al.  Imitation Learning of Whole-Body Grasps , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[47]  Wojciech Zaremba,et al.  Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[48]  Noel E. O'Connor,et al.  Shallow and Deep Convolutional Networks for Saliency Prediction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).