Real-world Multi-object, Multi-grasp Detection

A deep learning architecture is proposed to predict graspable locations for robotic manipulation. It handles scenes in which zero, one, or multiple objects are visible. By framing the learning problem as classification with null hypothesis competition rather than regression, the deep neural network takes an RGB-D image as input and predicts multiple grasp candidates for a single object or for multiple objects in a single shot. The method outperforms state-of-the-art approaches on the Cornell dataset, with 96.0% and 96.1% accuracy on the image-wise and object-wise splits, respectively. Evaluation on a multi-object dataset illustrates the generalization capability of the architecture. Grasping experiments achieve a 96.0% grasp localization rate and an 88.0% grasping success rate on a test set of household objects. The real-time process takes less than 0.25 s from image to plan.
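The idea of classification with null hypothesis competition can be illustrated with a small sketch. It is not the authors' implementation: the number of orientation bins (`NUM_BINS`), the 180-degree symmetry of the grasp angle, the candidate format, and the function names `angle_to_class` and `select_grasps` are all assumptions made for illustration. The sketch treats the grasp orientation as a class label, adds one extra "no grasp" class, and keeps a candidate only when its best orientation class outscores that null class.

```python
import math

# Hypothetical settings, not taken from the abstract.
NUM_BINS = 18              # assumed: 10-degree orientation bins over [-90, 90)
NULL_CLASS = NUM_BINS      # extra class index meaning "no grasp here"
BIN_WIDTH = 180.0 / NUM_BINS


def angle_to_class(theta_deg):
    """Map a grasp orientation in degrees to an orientation class index.

    Assumes a grasp rotated by 180 degrees is equivalent to the original.
    """
    wrapped = (theta_deg + 90.0) % 180.0   # now in [0, 180)
    return int(wrapped // BIN_WIDTH)


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_grasps(candidates):
    """Keep candidates whose best orientation class beats the null class.

    candidates: list of (box, scores), where scores has NUM_BINS + 1 entries
    and the last entry is the null ("no grasp") score.
    Returns a list of (box, bin_center_angle_deg, probability).
    """
    kept = []
    for box, scores in candidates:
        probs = softmax(scores)
        best = max(range(NUM_BINS), key=lambda k: probs[k])
        if probs[best] > probs[NULL_CLASS]:
            angle = -90.0 + (best + 0.5) * BIN_WIDTH
            kept.append((box, angle, probs[best]))
    return kept
```

Because every candidate is scored independently, a scene with no objects simply yields an empty list, while a scene with several objects can yield several grasps, which is the behavior a single regressed output could not express.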
