Task-oriented Grasping in Object Stacking Scenes with CRF-based Semantic Model

In task-oriented grasping, the robot must manipulate objects in a task-compatible manner, which is more valuable but also more challenging than merely grasping them stably. However, most existing works perform task-oriented grasping only in single-object scenes. This greatly limits their practical application in real-world scenes, where multiple stacked objects typically overlap and occlude one another. To perform task-oriented grasping in object stacking scenes, this paper first builds a synthetic dataset, the Object Stacking Grasping Dataset (OSGD), for task-oriented grasping in object stacking scenes. Second, a Conditional Random Field (CRF) is constructed to model the semantic content of object regions, capturing both the incompatibility of task labels and the continuity of task regions. This approach greatly reduces the interference caused by overlaps and occlusions in object stacking scenes. To embed the CRF-based semantic model into our grasp detection network, we implement the CRF inference process as a recurrent neural network (RNN), so that the whole model, Task-oriented Grasping CRFs (TOG-CRFs), can be trained end to end. Finally, in object stacking scenes, the constructed model helps the robot achieve a 69.4% success rate for task-oriented grasping.
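In the CRF-as-RNN line of work that the abstract builds on, the embedded inference is an unrolled mean-field approximation. The sketch below illustrates one such mean-field loop for a fully connected CRF with a Potts-style label-compatibility term; the function name, the dense affinity matrix, and the fixed iteration count are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def mean_field_inference(unary, pairwise, n_iters=5):
    """Illustrative mean-field loop for a fully connected CRF.

    unary:    (N, L) negative-log unary potentials per pixel and label
    pairwise: (N, N) symmetric affinity between pixels (e.g. a Gaussian kernel)
    Returns soft label assignments Q of shape (N, L).
    """
    # Initialise Q as the softmax of the unary potentials
    Q = np.exp(-unary)
    Q /= Q.sum(axis=1, keepdims=True)

    # Potts compatibility: a fixed penalty whenever two labels differ
    num_labels = unary.shape[1]
    compat = 1.0 - np.eye(num_labels)

    for _ in range(n_iters):
        # Message passing: aggregate the neighbours' current beliefs
        msg = pairwise @ Q
        # Compatibility transform, then combine with the unary term
        energy = unary + msg @ compat
        # Renormalise into a distribution over labels per pixel
        Q = np.exp(-energy)
        Q /= Q.sum(axis=1, keepdims=True)
    return Q
```

In the CRF-as-RNN formulation, each iteration of this loop becomes one recurrent step with shared parameters, which is what allows the CRF inference to be trained end to end together with the upstream grasp detection network.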
