Visual grasp affordances from appearance-based cues

In this paper, we investigate the prediction of visual grasp affordances from 2D measurements. Appearance-based estimation of grasp affordances is desirable when 3D scans are unreliable due to clutter or material properties. We develop a general framework for estimating grasp affordances from 2D sources, including local texture-like measures as well as object-category measures that capture previously learned grasp strategies. Local approaches to estimating grasp positions have been shown to be effective in real-world scenarios, but they cannot impart object-level biases and can be prone to false positives. We describe how global cues can be used to compute continuous pose estimates and corresponding grasp point locations, using a max-margin optimization for category-level continuous pose regression. We provide a novel dataset for evaluating visual grasp affordance estimation; on this dataset we show that a fused method outperforms either the local or the global method alone, and that continuous pose estimation improves over discrete-output models.
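To make the continuous pose regression idea concrete, the sketch below trains a linear regressor with an epsilon-insensitive (SVR-style, max-margin) loss to predict a continuous orientation angle from image-derived features. This is a simplified illustration, not the paper's category-level formulation: the synthetic features, the subgradient training loop, and all parameter values are assumptions made for the example. Representing the angle as a point (cos θ, sin θ) on the unit circle avoids the wrap-around discontinuity at 0/2π that a naive scalar regression would suffer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
theta = rng.uniform(0, 2 * np.pi, n)  # ground-truth continuous poses

# Synthetic appearance features: noisy observations of the pose direction.
X = np.stack([np.cos(theta), np.sin(theta)], axis=1) + 0.05 * rng.normal(size=(n, 2))
# Continuous pose target encoded on the unit circle to handle wrap-around.
Y = np.stack([np.cos(theta), np.sin(theta)], axis=1)

def fit_eps_insensitive(X, y, eps=0.05, lam=1e-3, lr=0.01, iters=2000):
    """Linear regression with an SVR-style epsilon-insensitive loss,
    trained by subgradient descent (a stand-in for a max-margin solver)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iters):
        r = X @ w + b - y
        # Subgradient of the eps-insensitive loss: zero inside the margin tube.
        g = np.where(np.abs(r) > eps, np.sign(r), 0.0)
        w -= lr * (X.T @ g / len(y) + lam * w)
        b -= lr * g.mean()
    return w, b

# One regressor per output dimension (cos and sin components).
models = [fit_eps_insensitive(X, Y[:, j]) for j in range(2)]
pred = np.stack([X @ w + b for w, b in models], axis=1)

# Recover the continuous angle and measure wrapped angular error.
theta_hat = np.arctan2(pred[:, 1], pred[:, 0]) % (2 * np.pi)
err = np.abs((theta_hat - theta + np.pi) % (2 * np.pi) - np.pi)
print(f"mean angular error: {err.mean():.3f} rad")
```

A discrete-output model would instead snap each prediction to one of a few viewpoint bins; the regression above keeps the full continuum of poses, which is what allows the predicted grasp point locations to vary smoothly with object orientation.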
