Weakly Supervised Labeling of Dominant Image Regions in Indoor Sequences

The capability of associating semantic concepts with available sensory data is an important component of environment understanding. In this work we describe an approach for annotating dominant image regions of uniform appearance that are typically encountered indoors, such as doors, walls and floors. One of the main challenges in classifying these regions correctly is handling large changes in appearance caused by varying lighting conditions. Instead of using a large amount of training data taken under different illumination conditions, we propose online updating of a model learned from a small number of training examples in the initial frame. We follow a two-stage classification strategy: in the first stage we estimate the probability of each region belonging to each class based on appearance only; in the second stage we use a Markov Random Field (MRF) to exploit the spatial layout of the scene and improve the classification results. The appearance model learned in the first frame is updated in subsequent frames using the confidences obtained by the two-stage classification strategy. We demonstrate our approach on two sequences of indoor environments.
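The two-stage strategy and the online update can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the feature vectors, the isotropic Gaussian appearance model around per-class means, the Potts pairwise term solved with iterated conditional modes (ICM), the confidence threshold, and the update rate. The function names `appearance_probs`, `mrf_refine` and `update_means` are hypothetical.

```python
import numpy as np

def appearance_probs(features, means, sigma=0.5):
    """Stage 1: per-region class probabilities from appearance alone.

    Assumes an isotropic Gaussian around each class mean (illustrative choice).
    """
    # Squared distance of every region feature to every class mean.
    d2 = ((features[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def mrf_refine(probs, edges, beta=1.0, iters=10):
    """Stage 2: Potts-model MRF over adjacent regions, minimized with ICM.

    ICM is a simple stand-in here, not necessarily the paper's inference method.
    """
    unary = -np.log(probs + 1e-12)
    labels = probs.argmax(axis=1)
    n, k = probs.shape
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iters):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j in neighbors[i]:
                # Penalize every label that disagrees with neighbor j.
                cost += beta * (np.arange(k) != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

def update_means(means, features, probs, labels, threshold=0.9, rate=0.2):
    """Online update: move each class mean toward confidently labeled regions."""
    new_means = means.astype(float)  # astype returns a copy
    for c in range(means.shape[0]):
        # Use only regions whose final label is c and whose appearance
        # probability for c is high (assumed confidence criterion).
        mask = (labels == c) & (probs[:, c] >= threshold)
        if mask.any():
            new_means[c] = (1 - rate) * means[c] + rate * features[mask].mean(axis=0)
    return new_means
```

In a sequence, the means would be initialized from the labeled examples in the first frame, and the three functions would then be applied to the regions of each subsequent frame in turn, so that the appearance model drifts with the illumination while ambiguous regions are excluded from the update.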
