GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping

Object grasping is critical for many applications, which is also a challenging computer vision problem. However, for cluttered scene, current researches suffer from the problems of insufficient training data and the lacking of evaluation benchmarks. In this work, we contribute a large-scale grasp pose detection dataset with a unified evaluation system. Our dataset contains 97,280 RGB-D image with over one billion grasp poses. Meanwhile, our evaluation system directly reports whether a grasping is successful by analytic computation, which is able to evaluate any kind of grasp poses without exhaustively labeling ground-truth. In addition, we propose an end-to-end grasp pose prediction network given point cloud inputs, where we learn approaching direction and operation parameters in a decoupled manner. A novel grasp affinity field is also designed to improve the grasping robustness. We conduct extensive experiments to show that our dataset and evaluation system can align well with real-world experiments and our proposed network achieves the state-of-the-art performance. Our dataset, source code and models are publicly available at www.graspnet.net.

[1]  Yifan Xu,et al.  SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters , 2018, ECCV.

[2]  Kate Saenko,et al.  Grasp Pose Detection in Point Clouds , 2017, Int. J. Robotics Res..

[3]  Yang Liu,et al.  O-CNN , 2017, ACM Trans. Graph..

[4]  Subhransu Maji,et al.  SPLATNet: Sparse Lattice Networks for Point Cloud Processing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Timothy Bretl,et al.  Self-supervised 6D Object Pose Estimation for Robot Manipulation , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[6]  Joseph Redmon,et al.  Real-time grasp detection using convolutional neural networks , 2014, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[7]  Vladlen Koltun,et al.  Tangent Convolutions for Dense Prediction in 3D , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Jianbin Tang,et al.  GraspNet: An Efficient Convolutional Neural Network for Real-time Grasp Detection for Low-powered Devices , 2018, IJCAI.

[9]  Peter Corke,et al.  Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach , 2018, Robotics: Science and Systems.

[10]  Yaron Lipman,et al.  Point convolutional neural networks by extension operators , 2018, ACM Trans. Graph..

[11]  Leonidas J. Guibas,et al.  Deep Hough Voting for 3D Object Detection in Point Clouds , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[12]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Xinyu Liu,et al.  Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics , 2017, Robotics: Science and Systems.

[14]  Honglak Lee,et al.  Deep learning for detecting robotic grasps , 2013, Int. J. Robotics Res..

[15]  Vincent Lepetit,et al.  Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes , 2012, ACCV.

[16]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Patricio A. Vela,et al.  Real-World Multiobject, Multigrasp Detection , 2018, IEEE Robotics and Automation Letters.

[18]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[19]  Leonidas J. Guibas,et al.  Frustum PointNets for 3D Object Detection from RGB-D Data , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Peter K. Allen,et al.  Generating multi-fingered robotic grasps via deep learning , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[21]  Cewu Lu,et al.  PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation , 2018, ArXiv.

[22]  Dieter Fox,et al.  6-DOF GraspNet: Variational Grasp Generation for Object Manipulation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Dong Tian,et al.  FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Sainan Liu,et al.  Attentional ShapeContextNet for Point Cloud Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..

[26]  Xiaogang Wang,et al.  PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[28]  Victor S. Lempitsky,et al.  Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Ye Duan,et al.  PointGrid: A Deep Network for 3D Shape Understanding , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[30]  Van-Duc Nguyen,et al.  Constructing Force- Closure Grasps , 1988, Int. J. Robotics Res..

[31]  Wei Wu,et al.  PointCNN: Convolution On X-Transformed Points , 2018, NeurIPS.

[32]  Ashutosh Saxena,et al.  Efficient grasping from RGBD images: Learning using a new rectangle representation , 2011, 2011 IEEE International Conference on Robotics and Automation.

[33]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[34]  Yin Zhou,et al.  VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35]  Dieter Fox,et al.  PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes , 2017, Robotics: Science and Systems.

[36]  P. Abbeel,et al.  Yale-CMU-Berkeley dataset for robotic manipulation research , 2017, Int. J. Robotics Res..

[37]  Silvio Savarese,et al.  Learning task-oriented grasping for tool manipulation from simulated self-supervision , 2018, Robotics: Science and Systems.

[38]  Dieter Fox,et al.  ContactGrasp: Functional Multi-finger Grasp Synthesis from Contact , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[39]  Yue Wang,et al.  Dynamic Graph CNN for Learning on Point Clouds , 2018, ACM Trans. Graph..

[40]  Yehoshua Y. Zeevi,et al.  The farthest point strategy for progressive image sampling , 1997, IEEE Trans. Image Process..

[41]  Di Guo,et al.  A hybrid deep architecture for robotic grasp detection , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[42]  Laurens van der Maaten,et al.  3D Semantic Segmentation with Submanifold Sparse Convolutional Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Nassir Navab,et al.  SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[44]  Emmanuel Dellandréa,et al.  Jacquard: A Large Scale Dataset for Robotic Grasp Detection , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[45]  Jianbin Tang,et al.  EnsembleNet: Improving Grasp Detection using an Ensemble of Convolutional Neural Networks , 2018, BMVC.

[46]  Zhiqiang Tian,et al.  ROI-based Robotic Grasp Detection for Object Overlapping Scenes , 2018, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[47]  Arkanath Pathak,et al.  Learning 6-DOF Grasping Interaction via Deep Geometry-Aware 3D Representations , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[48]  Abhinav Gupta,et al.  Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[49]  Douglas Chai,et al.  Review of Deep Learning Methods in Robotic Grasp Detection , 2018, Multimodal Technol. Interact..

[50]  Hao Su,et al.  S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scenes , 2019, CoRL.

[51]  Fuchun Sun,et al.  PointNetGPD: Detecting Grasp Configurations from Point Sets , 2018, 2019 International Conference on Robotics and Automation (ICRA).