Graspable Object Classification with Multi-loss Hierarchical Representations

To accomplish manipulation tasks effectively, a robot must recognize both whether an object is graspable and which category it belongs to, precisely and robustly, especially when training data are limited. In this paper, we propose a novel multi-loss hierarchical representation learning framework that recognizes the category of graspable objects in a coarse-to-fine manner. Our model consists of two main components: an efficient hierarchical feature learning component that combines kernel features with deep learning features, and a multi-loss function that optimizes the multi-task learning mechanism in a coarse-to-fine way. We evaluate the proposed system on data containing both graspable and ungraspable objects. The results show that our system outperforms many existing algorithms in both classification accuracy and computational efficiency. Moreover, it achieves a relatively high accuracy (about 82%) in unstructured real-world conditions.
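The multi-loss objective described above can be sketched as a weighted sum of a coarse-category loss and a fine-category loss. The snippet below is a minimal illustration, assuming softmax cross-entropy at both levels; the weights `lam_coarse` and `lam_fine` are hypothetical hyperparameters for illustration, not values from the paper.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example given raw logits and an integer label."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]

def multi_level_loss(coarse_logits, fine_logits, coarse_label, fine_label,
                     lam_coarse=0.3, lam_fine=0.7):
    """Coarse-to-fine multi-task objective: a weighted sum of two
    cross-entropy terms (weights are illustrative assumptions)."""
    return (lam_coarse * softmax_cross_entropy(coarse_logits, coarse_label)
            + lam_fine * softmax_cross_entropy(fine_logits, fine_label))

# Example: 2 coarse categories (graspable / ungraspable), 4 fine categories.
coarse_logits = np.array([2.0, -1.0])
fine_logits = np.array([0.5, 3.0, -0.5, 0.1])
loss = multi_level_loss(coarse_logits, fine_logits, coarse_label=0, fine_label=1)
```

In a full training loop both heads would share the hierarchical feature extractor, and the two loss terms would be backpropagated jointly so that the coarse task regularizes the fine one.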
