Semantic mapping using object-class segmentation of RGB-D images

For task planning and execution in unstructured environments, a robot needs to recognize and localize relevant objects. When this information is made persistent in a semantic map, it can be used, e.g., to communicate with humans. In this paper, we propose a novel approach to learning such maps. Our approach registers RGB-D camera measurements by means of simultaneous localization and mapping. We employ random decision forests to segment object classes in images and exploit dense depth measurements to obtain scale invariance. Our object recognition method integrates shape and texture seamlessly. The probabilistic segmentation from multiple views is filtered in a voxel-based 3D map using a Bayesian framework. We report on the quality of our object-class segmentation method and demonstrate the accuracy gained by fusing multiple views in a semantic map.
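
As an illustration of how per-view class probabilities might be filtered in a voxel map, the following minimal sketch applies a recursive Bayesian update per voxel. It is not the implementation described in the paper: it assumes that views are conditionally independent given the voxel's class, that each voxel starts from a uniform prior, and that points are already registered in a common map frame. The names `SemanticVoxelMap`, `fuse_observation`, and the 5 cm voxel size are hypothetical.

```python
import math
from collections import defaultdict


def fuse_observation(belief, observation, eps=1e-6):
    """Recursive Bayes update: multiply the current belief by the new
    per-class probabilities and renormalize (assumes independent views)."""
    # eps keeps a single zero-probability observation from vetoing a class.
    fused = [max(b, eps) * max(o, eps) for b, o in zip(belief, observation)]
    total = sum(fused)
    return [f / total for f in fused]


class SemanticVoxelMap:
    """Sparse voxel grid storing one class distribution per occupied voxel."""

    def __init__(self, num_classes, voxel_size=0.05):
        self.num_classes = num_classes
        self.voxel_size = voxel_size  # edge length in meters (assumed value)
        uniform = [1.0 / num_classes] * num_classes
        self.beliefs = defaultdict(lambda: list(uniform))

    def voxel_key(self, point):
        """Map a 3D point in the map frame to integer voxel indices."""
        return tuple(int(math.floor(c / self.voxel_size)) for c in point)

    def integrate(self, point, class_probs):
        """Fuse one pixel's class distribution into the voxel it falls in."""
        key = self.voxel_key(point)
        self.beliefs[key] = fuse_observation(self.beliefs[key], class_probs)
        return key

    def label(self, key):
        """Return the index of the most probable class for a voxel."""
        belief = self.beliefs[key]
        return max(range(self.num_classes), key=belief.__getitem__)


# Two views observe nearly the same surface point with moderate confidence.
semantic_map = SemanticVoxelMap(num_classes=3)
key = semantic_map.integrate((0.51, 0.02, 1.02), [0.5, 0.3, 0.2])
semantic_map.integrate((0.52, 0.03, 1.03), [0.6, 0.3, 0.1])
print(semantic_map.label(key), semantic_map.beliefs[key])
```

In this sketch, two moderately confident views that agree on a class yield a fused belief that is more confident than either view alone, which is the intuition behind the accuracy gain from multi-view fusion.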
