Learning part-based spatial models for laser-vision-based room categorization

Room categorization, that is, recognizing the functionality of a never-before-seen room, is a crucial capability for a household mobile robot. We present a new approach to room categorization that uses two-dimensional laser range data. The method relies on a novel spatial model consisting of mid-level parts built on top of a low-level part-based representation. We then fuse this approach with a vision-based room categorization method, which likewise uses a spatial model of mid-level visual parts. In addition, we propose a new discriminative dictionary learning technique, which we apply to part-dictionary selection in both the laser-based and vision-based modalities. Finally, we present a comparative analysis of the laser-based, vision-based, and laser-vision-fusion-based approaches in a uniform part-based framework, evaluated on a large dataset with several categories of rooms from domestic environments.
