Unsupervised discovery of object classes with a mobile robot

Object detection and recognition are fundamental capabilities for a mobile robot. Objects are a powerful representation for a variety of tasks, including mobile manipulation and inventory tracking. As a result, object-based world representations have seen a great deal of research interest in the last several years. However, these systems usually assume that object recognition is well-solved: they require that accurate recognition be available for every object they might encounter. Despite steady advances, object recognition remains a difficult, open problem. Existing object recognition algorithms rely on high-resolution three-dimensional object models or on extensive hand-labeled training data. The sheer variety of objects that occur in natural environments makes manually training a recognizer for every possible object infeasible. In this work, we present a robotic system for unsupervised object and class discovery, in which objects are first discovered and then grouped into classes in an unsupervised fashion. At each step, we approach the problem as one of robotics, not disembodied computer vision. On a very large robotic dataset, we discover object classes with 98.7% precision while achieving 71.8% recall. The scale and quality of these results demonstrate the merit of our approach and the practicality of long-term, large-scale object discovery. To our knowledge, no other authors have investigated robotic object discovery at this scale, making direct quantitative comparison impossible. We make our implementation and ground-truth labelings available so that this work can serve as a baseline against which future work can be compared.
