Affordance Prediction via Learned Object Attributes

We present a novel method for learning and predicting the affordances of an object based on its physical and visual attributes. Affordance prediction is a key task in autonomous robot learning, as it allows a robot to reason about the actions it can perform in order to accomplish its goals. Previous approaches to affordance prediction have either learned direct mappings from visual features to affordances, or have introduced object categories as an intermediate representation. In this paper, we argue that physical and visual attributes provide a more appropriate mid-level representation for affordance prediction, because they support information sharing between affordances and objects, resulting in superior generalization performance. In particular, affordances are more likely to be correlated with the attributes of an object than they are with its visual appearance or a linguistically-derived object category. We provide preliminary validation of our method experimentally, and present empirical comparisons to both the direct and category-based approaches to affordance prediction. Our encouraging results suggest the promise of the attribute-based approach to affordance prediction.
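The two-stage structure described above — mapping raw visual and physical features to mid-level attributes, then mapping attributes to affordances — can be illustrated with a minimal sketch. All feature, attribute, and affordance names below (and the threshold rules standing in for learned classifiers) are hypothetical illustrations, not the paper's actual representation:

```python
# Sketch of attribute-mediated affordance prediction. In practice each stage
# would be a learned classifier; hand-written rules stand in for them here.

def predict_attributes(features):
    """Stage 1: map raw visual/physical features to mid-level attributes.
    Feature names and thresholds are illustrative assumptions."""
    return {
        "flat_top": features["top_planarity"] > 0.8,
        "lightweight": features["mass_kg"] < 2.0,
        "rigid": features["deformation"] < 0.1,
    }

def predict_affordances(attributes):
    """Stage 2: map attributes to affordances. Because affordances depend on
    shared attributes rather than appearance, predictions can generalize to
    object categories never seen during training."""
    return {
        "pushable": attributes["rigid"] and attributes["lightweight"],
        "stackable": attributes["flat_top"] and attributes["rigid"],
    }

# Example: a small rigid box-like object.
features = {"top_planarity": 0.95, "mass_kg": 0.5, "deformation": 0.02}
affordances = predict_affordances(predict_attributes(features))
print(affordances)  # {'pushable': True, 'stackable': True}
```

The design point is that the affordance stage never sees raw appearance: any novel object whose attributes can be estimated inherits affordance predictions for free, which is the generalization advantage argued for over direct and category-based approaches.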
