Self-Supervised Online Learning of Basic Object Push Affordances

Continuous learning of object affordances in a cognitive robot is a challenging problem whose solution arguably requires a developmental approach. In this paper, we describe scenarios in which robotic systems push household objects with robot arms while observing the scene with cameras, and must incrementally learn, without external supervision, both the effect classes that emerge from these interactions and a discriminative model for predicting those classes from object properties. We formalize the scenario as a multi-view learning problem in which data co-occur over time in two separate data views, and we present an online learning framework that uses a self-supervised form of learning vector quantization to build the discriminative model. In experiments on data collected with two different robotic platforms, we demonstrate the effectiveness of this approach in comparison with related supervised methods.
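To make the multi-view idea concrete, the following is a minimal sketch, not the framework described in the paper. It assumes that each push yields an object-feature vector `x` (input view) and an effect-feature vector `y` (output view), that the nearest output-view prototype serves as a pseudo-label, and that a plain LVQ1-style rule updates the input-view prototypes. The class name `SelfSupervisedLVQ`, the fixed one-to-one labeling of input prototypes, the learning rate, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def nearest(prototypes, v):
    """Index of the prototype closest to v (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(prototypes - v, axis=1)))


class SelfSupervisedLVQ:
    """Hypothetical two-view learner: effect clusters supervise an LVQ1 classifier."""

    def __init__(self, in_protos, out_protos, lr=0.1):
        self.w_in = np.array(in_protos, dtype=float)    # object-property prototypes
        self.w_out = np.array(out_protos, dtype=float)  # effect prototypes
        self.lr = lr
        # Assumption: input prototype i is tied to effect cluster i mod K.
        self.in_labels = np.arange(len(self.w_in)) % len(self.w_out)

    def update(self, x, y):
        """Process one co-occurring (object features, effect features) pair."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Output view: online vector quantization; the winner is the pseudo-label.
        k = nearest(self.w_out, y)
        self.w_out[k] += self.lr * (y - self.w_out[k])
        # Input view: LVQ1 step, attract on label match, repel otherwise.
        j = nearest(self.w_in, x)
        sign = 1.0 if self.in_labels[j] == k else -1.0
        self.w_in[j] += sign * self.lr * (x - self.w_in[j])
        return k

    def predict(self, x):
        """Predict the effect cluster from object features alone."""
        return int(self.in_labels[nearest(self.w_in, np.asarray(x, dtype=float))])


# Synthetic stand-in for push data: two object types with two distinct effects.
rng = np.random.default_rng(0)
model = SelfSupervisedLVQ(in_protos=[[0.4, 0.4], [0.6, 0.6]],
                          out_protos=[[0.2], [0.8]])
for _ in range(200):
    c = int(rng.integers(2))                       # hidden object type, never shown to the model
    x = np.full(2, float(c)) + rng.normal(0.0, 0.05, size=2)
    y = np.array([float(c)]) + rng.normal(0.0, 0.05, size=1)
    model.update(x, y)

print(model.predict([0.0, 0.0]), model.predict([1.0, 1.0]))
```

The hidden type `c` only generates the synthetic data; the learner sees the `(x, y)` pairs alone, so the labels it trains on come entirely from the output view. A fuller treatment would learn the mapping between the two views' prototypes from co-occurrence statistics instead of fixing it in advance.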
