Object Learning through Interactive Manipulation and Foveated Vision

Autonomous robots that operate in unstructured environments must be able to expand their knowledge base seamlessly. To identify and manipulate previously unknown objects, a robot should continuously acquire new object knowledge even when no prior information about the objects or the environment is available. In this paper we propose to improve visual object learning and recognition by exploiting the advantages of foveated vision. The proposed approach first creates object hypotheses in the peripheral stereo cameras. Next, the robot directs its gaze toward one of the hypotheses to acquire images of the hypothetical object with its foveal cameras. This enables a more thorough investigation of a smaller area of the scene, observed at higher resolution. The additional information needed to verify a hypothesis is obtained through interactive manipulation: a teacher, or the robot itself, induces a change in the scene by manipulating the hypothetical object. We compare two methods for validating hypotheses in the foveal view and experimentally demonstrate the advantage of foveated vision over standard active stereo vision that relies on camera systems with a fixed field of view.
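The pipeline summarized above (peripheral hypothesis generation, foveation, verification by induced scene change) can be sketched in code. The sketch below is only an illustration under stated assumptions, not the paper's implementation: `Hypothesis`, `generate_hypotheses`, `foveate`, and `verify_by_manipulation` are hypothetical names, the scene is a toy list of image regions, and "coherent motion after a push" is modeled simply as the foveal feature set changing as a whole.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    region: tuple                      # candidate region (x, y, w, h) in the peripheral view
    features_before: set = field(default_factory=set)
    features_after: set = field(default_factory=set)
    confirmed: bool = False

def generate_hypotheses(peripheral_scene):
    """Segment the peripheral stereo view into candidate object regions."""
    return [Hypothesis(region=r) for r in peripheral_scene]

def foveate(hypothesis):
    """Stand-in for directing the foveal cameras at one hypothesis and
    extracting high-resolution features; here features are just tagged
    with the observed region."""
    return {("feat", hypothesis.region, i) for i in range(3)}

def verify_by_manipulation(hypothesis, push):
    """Confirm a hypothesis if its foveal features move coherently after
    the scene is changed by pushing the hypothetical object."""
    hypothesis.features_before = foveate(hypothesis)
    pushed = Hypothesis(region=push(hypothesis.region))
    hypothesis.features_after = foveate(pushed)
    # coherent motion: the whole feature set moved, so before/after are disjoint
    hypothesis.confirmed = hypothesis.features_before.isdisjoint(
        hypothesis.features_after)
    return hypothesis.confirmed

# toy scene: two candidate regions found in the peripheral view
scene = [(10, 10, 5, 5), (40, 20, 8, 8)]
hypotheses = generate_hypotheses(scene)
shift = lambda r: (r[0] + 3, r[1], r[2], r[3])   # the push displaces the object
results = [verify_by_manipulation(h, shift) for h in hypotheses]
print(results)  # prints [True, True]: both candidates moved, both confirmed
```

A push that leaves the region unchanged (e.g. the "object" was actually part of the background) yields identical feature sets before and after, so the hypothesis is rejected; this mirrors the role interactive manipulation plays in the verification step.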
