Robotic Grasping of Novel Objects from RGB-D Images by Using Multi-Level Convolutional Neural Networks

Inspired by human grasping, we propose an efficient robotic grasp detection method that obtains the optimal grasping rectangle from RGB-D images using multi-level Convolutional Neural Networks (CNNs). The multi-level CNNs consist of three levels with different structures and functions. The first level roughly locates the object to be grasped and thereby determines the search range for grasping rectangles. The second level uses a small-scale CNN to generate preselected grasping rectangles, quickly keeping usable candidates and eliminating unusable ones. The third level re-evaluates the preselected grasping rectangles with a large-scale CNN that captures many more features, so that each preselected rectangle is evaluated accurately. A selection algorithm then obtains the optimal grasping rectangle. Tests on a large-scale grasping dataset show that the proposed method significantly improves the accuracy of the grasping rectangle compared with existing CNN-based methods. Grasping experiments performed on a 5-DOF Youbot arm indicate that the proposed method finds the optimal grasping rectangle and successfully grasps novel objects.
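The coarse-to-fine flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two scoring callables stand in for the small-scale and large-scale CNNs, the search region stands in for the output of the first level, and a plain argmax replaces the selection algorithm, whose details the abstract does not give. All names (`GraspRect`, `candidates_in`, `cascade`, `keep_threshold`, the toy scorers) and all numeric defaults are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class GraspRect:
    """Oriented grasping rectangle (hypothetical representation)."""
    x: float      # center x, in pixels
    y: float      # center y, in pixels
    theta: float  # orientation, in radians
    w: float      # rectangle width
    h: float      # rectangle height


# Search range assumed to come from level 1: (x_min, y_min, x_max, y_max).
Region = Tuple[int, int, int, int]


def candidates_in(region: Region, step: int = 10, n_angles: int = 4,
                  w: float = 40.0, h: float = 20.0) -> List[GraspRect]:
    """Enumerate candidate rectangles on a grid inside the search range."""
    x0, y0, x1, y1 = region
    return [
        GraspRect(x, y, k * math.pi / n_angles, w, h)
        for x in range(x0, x1 + 1, step)
        for y in range(y0, y1 + 1, step)
        for k in range(n_angles)
    ]


def cascade(region: Region,
            fast_score: Callable[[GraspRect], float],
            accurate_score: Callable[[GraspRect], float],
            keep_threshold: float = 0.5) -> Optional[GraspRect]:
    """Filter candidates with a cheap scorer, then rank survivors with a costly one."""
    # Level 2 stand-in: cheap scorer discards unusable candidates.
    preselected = [r for r in candidates_in(region)
                   if fast_score(r) >= keep_threshold]
    if not preselected:
        return None
    # Level 3 stand-in plus selection: argmax is an assumption,
    # not the selection algorithm from the paper.
    return max(preselected, key=accurate_score)


# Toy scorers for demonstration only; real scorers would be trained CNNs.
def toy_fast(r: GraspRect) -> float:
    return 1.0 if abs(r.x - 50) <= 20 and abs(r.y - 50) <= 20 else 0.0


def toy_accurate(r: GraspRect) -> float:
    return -((r.x - 50) ** 2 + (r.y - 50) ** 2) - abs(r.theta - math.pi / 2)


best = cascade((0, 0, 100, 100), toy_fast, toy_accurate)
```

The point of the structure is cost: the expensive scorer runs only on the candidates that survive the cheap filter, which is how a cascade of this kind keeps evaluation time down without giving up accuracy on the final ranking.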
