Paper information - 3D all the way: Semantic segmentation of urban scenes from start to end in 3D

We propose a new approach for semantic segmentation of 3D city models. Starting from a structure-from-motion (SfM) reconstruction of a street-side scene, we perform classification and facade splitting purely in 3D, obviating the need for slow image-based semantic segmentation methods. We show that a properly trained pure-3D approach produces high-quality labelings with a significant speed benefit (20x faster), allowing us to analyze entire streets in a matter of minutes. Additionally, if speed is not of the essence, the 3D labeling can be combined with the results of a state-of-the-art 2D classifier, further boosting performance. We also propose a novel facade separation method based on semantic nuances between facades. Finally, inspired by the use of architectural principles for 2D facade labeling, we propose new 3D-specific principles and an efficient optimization scheme based on an integer quadratic programming formulation.
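
To make the idea of combining a 3D labeling with 2D classifier results concrete, here is a minimal sketch of one common way to fuse two per-point class distributions: a weighted sum of log-probabilities followed by an argmax. The abstract does not say how the combination is performed, so this is an illustrative assumption and not the paper's method; the function name `fuse_scores`, the weight `w3d`, the class list, and the probability values are all hypothetical.

```python
import math

# Hypothetical class list, used only for readable output.
CLASSES = ["wall", "window", "door"]


def fuse_scores(p3d, p2d, w3d=0.5, eps=1e-9):
    """Late fusion of two per-point class distributions (an assumed scheme).

    p3d, p2d: lists of per-point probability lists, one entry per class.
    w3d:      weight of the 3D classifier; the 2D classifier gets 1 - w3d.
    Returns the index of the best fused class for every point.
    """
    labels = []
    for probs_3d, probs_2d in zip(p3d, p2d):
        # Weighted sum of log-probabilities; eps guards against log(0).
        fused = [
            w3d * math.log(a + eps) + (1.0 - w3d) * math.log(b + eps)
            for a, b in zip(probs_3d, probs_2d)
        ]
        labels.append(max(range(len(fused)), key=fused.__getitem__))
    return labels


# Made-up scores for two 3D points.
p3d = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]
p2d = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]

fused_labels = fuse_scores(p3d, p2d)            # both classifiers, equal weight
only_3d_labels = fuse_scores(p3d, p2d, w3d=1.0)  # 3D classifier alone

print([CLASSES[i] for i in fused_labels])
print([CLASSES[i] for i in only_3d_labels])
```

In this toy input the second point is labeled differently once the 2D scores are taken into account, which is the kind of correction a combined labeling can provide at the cost of also running the slower image-based classifier.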
