Task-oriented grasping with semantic and geometric scene understanding

We present a task-oriented grasp model that encodes grasps which are configurationally compatible with a given task. For instance, if the task is to pour liquid from a container, the model encodes grasps that leave the opening of the container unobstructed. The model consists of two independent agents. The first is a geometric grasp model that computes, from a depth image, a distribution of 6D grasp poses for which the shape of the gripper matches the shape of the underlying surface; it relies on a dictionary of geometric object parts annotated with workable gripper poses and preshape parameters, and it is learned from experience via kinesthetic teaching. The second is a CNN-based semantic model that identifies grasp-suitable regions in a depth image, i.e., regions where a grasp will not impede the execution of the task; it allows us to encode relationships such as "grasp from the handle." A key element of this work is the use of a deep network to integrate contextual task cues, while deferring the structured-output problem of gripper-pose computation to an explicit, learned geometric model. Jointly, the two models generate grasps that are mechanically fit and that grip the object in a way that enables the intended task.
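
To make the joint generation concrete, the following is a minimal sketch, not the authors' implementation, of how candidates from a geometric grasp model could be re-ranked by a per-pixel task-suitability map produced by the semantic CNN: geometric fit and semantic suitability multiply into a joint score, so grasps on task-unsuitable regions are suppressed. The function names, array shapes, and the camera-projection helper are illustrative assumptions.

```python
import numpy as np

def combine_grasps(grasp_poses, geometric_scores, suitability_map, project_to_pixel):
    """Rank grasp candidates by geometric fit x semantic task suitability.

    grasp_poses      : (N, 4, 4) array of 6D gripper poses as homogeneous transforms
    geometric_scores : (N,) shape-matching scores from the geometric grasp model
    suitability_map  : (H, W) per-pixel task suitability from the semantic CNN, in [0, 1]
    project_to_pixel : camera model mapping a 3D point to integer image coordinates (u, v)
    """
    scores = np.empty(len(grasp_poses))
    for i, (pose, g) in enumerate(zip(grasp_poses, geometric_scores)):
        u, v = project_to_pixel(pose[:3, 3])  # grasp contact point in the depth image
        s = suitability_map[v, u]             # high near "grasp here" regions (e.g., a handle)
        scores[i] = g * s                     # joint score: mechanically fit AND task-compatible
    order = np.argsort(-scores)               # best candidates first
    return grasp_poses[order], scores[order]
```

Under this reading, a grasp must satisfy both agents at once: a pose with excellent surface contact still scores near zero if it covers the container's opening during a pouring task.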
