Semantic Urban Maps

A novel region-based 3D semantic mapping method is proposed for urban scenes. The proposed Semantic Urban Maps (SUM) method uses a Markov Random Field based classification framework to label the regions of segmented images with geometric and semantic classes simultaneously. The pixels of the labeled images are back-projected into a set of 3D point clouds using stereo disparity. The point clouds are then registered by incorporating motion estimates, yielding a coherent semantic map representation. SUM is evaluated on five urban benchmark sequences and is shown to retrieve both geometric and semantic labels successfully. A comparison with a relevant state-of-the-art method shows that SUM is competitive and achieves higher average pixel-wise accuracy than the competing method.
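The back-projection and registration steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a rectified pinhole stereo pair with focal length `f` (pixels), baseline `baseline` (metres), and principal point `(cx, cy)`, and it assumes that motion estimation supplies one rotation `R` and translation `t` per frame. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

# One labeled 3D point: (x, y, z, class_label)
Point = Tuple[float, float, float, int]


def back_project(disparity: Sequence[Sequence[float]],
                 labels: Sequence[Sequence[int]],
                 f: float, baseline: float,
                 cx: float, cy: float,
                 min_disp: float = 1e-6) -> List[Point]:
    """Turn a labeled disparity image into labeled 3D points (camera frame).

    Assumes the standard rectified-stereo relation Z = f * baseline / d.
    Pixels with disparity <= min_disp are skipped (no valid depth).
    """
    points: List[Point] = []
    for v, (disp_row, label_row) in enumerate(zip(disparity, labels)):
        for u, (d, lab) in enumerate(zip(disp_row, label_row)):
            if d <= min_disp:
                continue
            z = f * baseline / d
            x = (u - cx) * z / f
            y = (v - cy) * z / f
            points.append((x, y, z, lab))
    return points


def transform(points: Sequence[Point],
              R: Sequence[Sequence[float]],
              t: Sequence[float]) -> List[Point]:
    """Apply the rigid motion p' = R p + t, carrying each label along."""
    out: List[Point] = []
    for x, y, z, lab in points:
        wx = R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0]
        wy = R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1]
        wz = R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2]
        out.append((wx, wy, wz, lab))
    return out


def build_map(frames, f: float, baseline: float,
              cx: float, cy: float) -> List[Point]:
    """Accumulate per-frame clouds into one map.

    `frames` is an iterable of (disparity, labels, R, t) tuples, where
    (R, t) is the assumed camera-to-world pose for that frame.
    """
    cloud: List[Point] = []
    for disparity, labels, R, t in frames:
        local = back_project(disparity, labels, f, baseline, cx, cy)
        cloud.extend(transform(local, R, t))
    return cloud
```

In this sketch every 3D point simply inherits the label of the pixel it came from; how conflicting labels from overlapping frames are reconciled is not specified in the abstract and is left out here.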