Fully Convolutional Grasp Detection Network with Oriented Anchor Box

In this paper, we present a real-time approach to predicting multiple grasping poses for a parallel-plate robotic gripper from RGB images. We propose a model with an oriented anchor box mechanism, together with a new matching strategy used during training. Our work employs an end-to-end fully convolutional neural network consisting of two parts: a feature extractor and a multi-grasp predictor. The feature extractor is a deep convolutional neural network. The multi-grasp predictor regresses grasp rectangles from predefined oriented rectangles, called oriented anchor boxes, and classifies each rectangle as graspable or ungraspable. On the standard Cornell Grasp Dataset, our model achieves accuracies of 97.74% and 96.61% on the image-wise and object-wise splits respectively, outperforming the latest state-of-the-art approach by 1.74% on the image-wise split and 0.51% on the object-wise split.
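The sketch below illustrates the general idea behind an oriented anchor box predictor, not the authors' actual implementation: each feature-map cell carries several predefined boxes that differ only in orientation, and the network outputs per-anchor offsets (tx, ty, tw, th, t_theta) plus a graspable/ungraspable score. The grid size, stride, anchor size, and number of orientations used here are illustrative assumptions.

```python
# Minimal sketch (assumed parameters, not the paper's exact configuration) of
# oriented anchor box generation and offset decoding for grasp rectangles.
import numpy as np

def make_oriented_anchors(feat_size=20, stride=16, anchor_size=54.0, num_angles=6):
    """Return anchors of shape (feat_size, feat_size, num_angles, 5) as (x, y, w, h, theta)."""
    # Orientations evenly cover [0, pi); e.g. 6 anchors -> 0, 30, ..., 150 degrees.
    angles = np.arange(num_angles) * np.pi / num_angles
    ys, xs = np.meshgrid(np.arange(feat_size), np.arange(feat_size), indexing="ij")
    anchors = np.zeros((feat_size, feat_size, num_angles, 5), dtype=np.float32)
    anchors[..., 0] = ((xs + 0.5) * stride)[..., None]   # anchor center x in image coords
    anchors[..., 1] = ((ys + 0.5) * stride)[..., None]   # anchor center y in image coords
    anchors[..., 2] = anchor_size                        # predefined box width
    anchors[..., 3] = anchor_size                        # predefined box height
    anchors[..., 4] = angles                             # per-anchor orientation
    return anchors

def decode_grasp(anchor, offsets):
    """Apply regressed offsets (tx, ty, tw, th, t_theta) to one anchor box."""
    ax, ay, aw, ah, atheta = anchor
    tx, ty, tw, th, ttheta = offsets
    gx = ax + tx * aw          # shift center relative to anchor size
    gy = ay + ty * ah
    gw = aw * np.exp(tw)       # scale width/height exponentially
    gh = ah * np.exp(th)
    gtheta = atheta + ttheta   # rotate relative to the anchor orientation
    return np.array([gx, gy, gw, gh, gtheta])

anchors = make_oriented_anchors()
# Decode one anchor (cell (10, 10), third orientation) with example offsets.
grasp = decode_grasp(anchors[10, 10, 2], np.array([0.1, -0.05, 0.2, 0.0, 0.1]))
print(grasp)  # predicted grasp rectangle (x, y, w, h, theta)
```

In training, each ground-truth grasp rectangle would be matched to anchors of similar orientation, which is where a matching strategy of the kind described above comes into play.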
