GraspFusionNet: a two-stage multi-parameter grasp detection network based on RGB–XYZ fusion in dense clutter

Robotic grasping of a diverse range of novel objects in dense clutter is a major challenge and is critical to many applications. However, current methods are vulnerable to perception uncertainty when objects are densely stacked, which limits the accuracy of multi-parameter grasp prediction. In this paper, we propose a two-stage grasp detection pipeline consisting of a sampling stage and a prediction stage. The sampling stage applies a fully convolutional network to generate grasp proposal regions, which contain potentially graspable objects. Within each grasp proposal region, the prediction stage predicts the complete set of grasp parameters from fused RGB–XYZ heightmaps, which are converted from color and depth images. To perceive the structures essential for stable grasping, a 2D CNN and a 3D CNN are used to learn color and geometric features, respectively, for multi-parameter grasp prediction. The direct mapping from heightmaps to grasp parameters is learned with a multi-task loss. Experiments on a self-built dataset and an open dataset are conducted to analyze the network performance. The results indicate that the proposed two-stage method outperforms the other grasp detection algorithms compared. Robotic experiments demonstrate generalization ability and robustness for novel objects in dense clutter, and the proposed method achieves an average grasp success rate of 82.4%, which is also better than other state-of-the-art methods. Our self-built dataset and robotic grasping video are available at https://github.com/liuwenhai/toteGrasping.git.
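
The conversion of color and depth images into RGB–XYZ heightmaps can be illustrated with a minimal sketch. It assumes the depth image has already been back-projected into a colored point cloud expressed in a workspace frame whose z-axis points up, and it uses a simple top-down orthographic projection that keeps the highest point in each grid cell. The function name `rgbxyz_heightmap`, the workspace ranges, and the pixel size are hypothetical choices for illustration, not details taken from the paper.

```python
import numpy as np

def rgbxyz_heightmap(points, colors, x_range, y_range, pixel_size):
    """Project a colored point cloud top-down into RGB and XYZ heightmaps.

    points: (N, 3) float array of x, y, z in the workspace frame (assumed).
    colors: (N, 3) array of RGB values aligned with points.
    x_range, y_range: (min, max) workspace bounds (hypothetical parameters).
    pixel_size: edge length of one heightmap cell, same unit as points.
    """
    points = np.asarray(points, dtype=np.float32)
    colors = np.asarray(colors)
    width = int(round((x_range[1] - x_range[0]) / pixel_size))
    height = int(round((y_range[1] - y_range[0]) / pixel_size))

    # Map each point to a grid cell and drop points outside the workspace.
    cols = np.floor((points[:, 0] - x_range[0]) / pixel_size).astype(int)
    rows = np.floor((points[:, 1] - y_range[0]) / pixel_size).astype(int)
    valid = (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    points, colors = points[valid], colors[valid]
    rows, cols = rows[valid], cols[valid]

    # Sort by cell index, then by height, and keep the last (highest)
    # point of every cell so each cell is written exactly once.
    flat = rows * width + cols
    order = np.lexsort((points[:, 2], flat))
    flat_sorted = flat[order]
    is_last = np.ones(len(flat_sorted), dtype=bool)
    is_last[:-1] = flat_sorted[1:] != flat_sorted[:-1]
    top = order[is_last]

    # Empty cells stay zero in both maps.
    rgb_map = np.zeros((height, width, 3), dtype=colors.dtype)
    xyz_map = np.zeros((height, width, 3), dtype=np.float32)
    rgb_map[rows[top], cols[top]] = colors[top]
    xyz_map[rows[top], cols[top]] = points[top]
    return rgb_map, xyz_map
```

The two returned maps are pixel-aligned, so a color branch and a geometry branch could each consume one of them over the same proposal region. How the paper's 2D and 3D CNNs actually ingest and fuse these inputs is not specified in the abstract, so that step is left out of this sketch.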
