Towards Bootstrap Learning for Object Discovery

We show how a robot can autonomously learn an ontology of objects to explain aspects of its sensor input from an unknown dynamic world. Unsupervised learning about objects is an important conceptual step in developmental learning, whereby the agent clusters observations across space and time to construct stable perceptual representations of objects. Our proposed unsupervised learning method uses the properties of allocentric occupancy grids to classify individual sensor readings as static or dynamic. Dynamic readings are clustered, and the clusters are tracked over time to identify objects, separating them both from the background of the environment and from the noise of unexplainable sensor readings. Once trackable clusters of sensor readings (i.e., objects) have been identified, we build shape models wherever shape is a stable and consistent property of these objects. The representation can nonetheless tolerate, represent, and track amorphous objects as well as those that have a well-defined shape. In the end, the learned ontology makes it possible for the robot to describe a cluttered dynamic world with symbolic object descriptions along with a static environment model, both grounded in sensory experience and learned without external supervision.
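The classify, cluster, and track steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based grid, the thresholds (`free_below`, `max_gap`, `max_jump`), and the greedy nearest-centroid association are all assumptions chosen for brevity. Readings are assumed to be 2D endpoints already expressed in the allocentric (map) frame.

```python
import math


def classify_readings(grid, points, cell_size=0.1, free_below=0.2):
    """Split range-reading endpoints into (static, dynamic) lists.

    grid maps (col, row) cells to an occupancy probability. A reading that
    ends in a cell the grid believes is free space is labeled dynamic.
    Cells missing from the grid count as unknown (0.5), so they are static.
    All thresholds here are hypothetical values for illustration.
    """
    static, dynamic = [], []
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        if grid.get(cell, 0.5) < free_below:
            dynamic.append((x, y))
        else:
            static.append((x, y))
    return static, dynamic


def cluster_points(points, max_gap=0.3):
    """Single-linkage clustering: points within max_gap join one cluster."""
    clusters = []
    unvisited = list(points)
    while unvisited:
        frontier = [unvisited.pop()]
        cluster = []
        while frontier:
            p = frontier.pop()
            cluster.append(p)
            near = [q for q in unvisited if math.dist(p, q) <= max_gap]
            for q in near:
                unvisited.remove(q)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


def centroid(cluster):
    """Mean position of the points in one cluster."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def update_tracks(tracks, clusters, next_id, max_jump=0.5):
    """Greedily match each new cluster to the nearest unmatched track.

    tracks maps a track id to its last centroid. A cluster farther than
    max_jump from every remaining track starts a new track. Tracks that
    receive no cluster are dropped. Returns (new tracks, next unused id).
    """
    updated = {}
    free = dict(tracks)
    for c in map(centroid, clusters):
        best = min(free, key=lambda t: math.dist(free[t], c), default=None)
        if best is not None and math.dist(free[best], c) <= max_jump:
            updated[best] = c
            del free[best]
        else:
            updated[next_id] = c
            next_id += 1
    return updated, next_id
```

Calling `classify_readings`, then `cluster_points` on the dynamic readings, then `update_tracks` once per scan yields a set of persistent track identifiers; a cluster that keeps its identifier across many scans is a candidate object in the sense of the abstract. Shape modeling and the handling of sensor noise are left out of this sketch.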
