Predicting Multiple Pregrasping Poses by Combining Deep Convolutional Neural Networks with Mixture Density Networks

In this paper, we propose a deep neural network that predicts pregrasp poses for a three-dimensional (3D) object. Specifically, a single RGB-D image is used to determine multiple pregrasp positions for the three fingers of a robotic hand, across various poses of known or unknown objects. Predicting multiple pregrasp poses requires modeling a multi-valued function: one input image maps to several valid outputs, and standard regression models fail in this setting because they return a single estimate. To this end, the proposed network combines a variant of the traditional deep convolutional neural network with a mixture density network. Furthermore, to overcome the difficulty of training the first part of the network (the convolutional variant) with insufficient data, we develop a supervised learning technique to pretrain it.
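To illustrate why a mixture density output suits a multi-valued target, the following is a minimal sketch of the generic mixture density network loss (the negative log-likelihood of a Gaussian mixture), written in NumPy. It is not the network described here: the abstract does not state the number of mixture components, the pose parameterization, or the covariance form, so the isotropic Gaussians, the array shapes, and the name `mdn_nll` are all assumptions made for illustration.

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, y):
    """Mean negative log-likelihood of y under an isotropic Gaussian mixture.

    Hypothetical shapes (assumptions, not taken from the paper):
      pi_logits: (N, K)    unnormalized mixing weights
      mu:        (N, K, D) component means
      log_sigma: (N, K)    log standard deviation of each component
      y:         (N, D)    target vectors
    """
    d = y.shape[1]
    # Log-softmax over components, computed stably.
    log_pi = pi_logits - np.logaddexp.reduce(pi_logits, axis=1, keepdims=True)
    # Squared distance from each target to each component mean: (N, K).
    sq_dist = np.sum((y[:, None, :] - mu) ** 2, axis=2)
    # Log density of an isotropic D-dimensional Gaussian.
    log_gauss = (-0.5 * sq_dist / np.exp(2.0 * log_sigma)
                 - d * log_sigma
                 - 0.5 * d * np.log(2.0 * np.pi))
    # Log-sum-exp over components gives the mixture log-likelihood.
    log_lik = np.logaddexp.reduce(log_pi + log_gauss, axis=1)
    return -np.mean(log_lik)

# Toy multi-valued target: the same input admits two answers, +1 and -1.
y = np.array([[1.0], [-1.0]])

# Two-component mixture with one mode on each answer.
nll_mix = mdn_nll(
    pi_logits=np.zeros((2, 2)),
    mu=np.array([[[1.0], [-1.0]], [[1.0], [-1.0]]]),
    log_sigma=np.full((2, 2), np.log(0.1)),
    y=y,
)

# Single Gaussian centred on the average of the two answers.
nll_single = mdn_nll(
    pi_logits=np.zeros((2, 1)),
    mu=np.zeros((2, 1, 1)),
    log_sigma=np.zeros((2, 1)),
    y=y,
)

print(nll_mix < nll_single)
```

In this toy case the single Gaussian places its mean between the two valid answers, which is the failure mode of standard regression on multi-valued targets, while the mixture assigns high density to both answers.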
