Deep vision networks for real-time robotic grasp detection

Grasping has long been a major challenge for robots, largely because of their limited ability to interpret the sensing data they perceive. In this work, we propose an end-to-end deep vision network model that predicts good grasp candidates from real-world images in real time. To accelerate grasp detection, reference rectangles are designed to suggest potential grasp locations, which are then refined into robotic grasp rectangles in the image. With the proposed model, the graspable score for each location in the image and the corresponding predicted grasp rectangle can be obtained in real time, at a rate of 80 frames per second on a graphics processing unit (GPU). The model is evaluated on a data set collected with a real robot, and different reference-rectangle settings are compared to find the best detection performance. The experimental results demonstrate that the proposed approach helps the robot quickly learn the graspable part of an object from an image.
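The reference-rectangle mechanism described above works like anchor-based region proposal: candidate rectangles are tiled over the image, each is scored for graspability, and predicted offsets refine it into a grasp rectangle. A minimal sketch of that idea is shown below; the function names, rectangle sizes, and offset parameterisation are illustrative assumptions, not the paper's actual implementation (which also handles grasp orientation).

```python
import numpy as np

def make_reference_rectangles(feat_h, feat_w, stride, sizes):
    """Tile reference (anchor) rectangles over the image: one set of
    candidate grasp boxes (cx, cy, w, h) centred at each feature-map cell.
    `stride` maps feature-map cells back to image pixels."""
    rects = []
    for i in range(feat_h):
        for j in range(feat_w):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
            for w, h in sizes:
                rects.append((cx, cy, w, h))
    return np.array(rects, dtype=np.float64)  # shape (feat_h*feat_w*len(sizes), 4)

def refine(rects, deltas):
    """Apply network-predicted offsets (dx, dy, dw, dh) to the reference
    rectangles, using the standard anchor-regression parameterisation."""
    cx = rects[:, 0] + deltas[:, 0] * rects[:, 2]
    cy = rects[:, 1] + deltas[:, 1] * rects[:, 3]
    w = rects[:, 2] * np.exp(deltas[:, 2])
    h = rects[:, 3] * np.exp(deltas[:, 3])
    return np.stack([cx, cy, w, h], axis=1)

# 2x2 feature map, stride 16, one hypothetical reference size per cell.
rects = make_reference_rectangles(2, 2, stride=16, sizes=[(32, 16)])
deltas = np.zeros((len(rects), 4))       # zero offsets: no refinement
refined = refine(rects, deltas)          # identical to the references
```

At inference time, rectangles whose graspable score exceeds a threshold would be refined this way and the top-scoring one executed by the robot.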
