Paper information - ATLAS: A Three-Layered Approach to Facade Parsing

We propose a novel approach for semantic segmentation of building facades. Our system consists of three distinct layers, representing different levels of abstraction in facade images: segments, objects and architectural elements. In the first layer, the facade is segmented into regions, each of which is assigned a probability distribution over semantic classes. We evaluate different state-of-the-art segmentation and classification strategies to obtain the initial probabilistic semantic labeling. In the second layer, we investigate the performance of different object detectors and show the benefit of using such detectors to improve our initial labeling. The generic approaches of the first two layers are then specialized for the task of facade labeling in the third layer. There, we incorporate additional meta-knowledge in the form of weak architectural principles, which enforces architectural plausibility and consistency on the final reconstruction. Rigorous tests performed on two existing datasets of building facades demonstrate that we outperform the current state of the art, even when using outputs from lower layers of the pipeline. Finally, we demonstrate how the output of the highest layer can be used to create a procedural building reconstruction.
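The flow of information through the three layers can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the method from the paper: the grid of cells stands in for segmented regions, the multiplicative fusion rule stands in for the detector integration, and the majority-vote "rectangularity" rule stands in for a weak architectural principle. All function and variable names are hypothetical.

```python
# Illustrative sketch only: the fusion rule, the rectangularity rule, and all
# names below are assumptions, not the method described in the paper.
LABELS = ["wall", "window"]
WALL = LABELS.index("wall")

def argmax_labels(probs):
    """Pick the most probable class index for every cell."""
    return [[max(range(len(cell)), key=cell.__getitem__) for cell in row]
            for row in probs]

def fuse_detections(probs, detections):
    """Layer 2 stand-in: boost the detected class inside each box, renormalize."""
    fused = [[cell[:] for cell in row] for row in probs]
    for cls, top, left, bottom, right, score in detections:
        for r in range(top, bottom + 1):
            for c in range(left, right + 1):
                fused[r][c][cls] *= 1.0 + score
                total = sum(fused[r][c])
                fused[r][c] = [p / total for p in fused[r][c]]
    return fused

def enforce_rectangles(labels, detections):
    """Layer 3 stand-in: each detected box becomes uniformly one class by majority."""
    out = [row[:] for row in labels]
    for cls, top, left, bottom, right, _score in detections:
        cells = [(r, c) for r in range(top, bottom + 1)
                 for c in range(left, right + 1)]
        hits = sum(labels[r][c] == cls for r, c in cells)
        new_label = cls if 2 * hits > len(cells) else WALL
        for r, c in cells:
            out[r][c] = new_label
    return out

# Layer 1 stand-in: per-cell distributions over [wall, window] on a 3x4 grid.
probs = [
    [[0.9, 0.1], [0.9, 0.1],   [0.9, 0.1],   [0.9, 0.1]],
    [[0.9, 0.1], [0.45, 0.55], [0.55, 0.45], [0.9, 0.1]],
    [[0.9, 0.1], [0.4, 0.6],   [0.7, 0.3],   [0.9, 0.1]],
]
# One hypothetical window detection: (class, top, left, bottom, right, score).
detections = [(1, 1, 1, 2, 2, 0.5)]

layer1 = argmax_labels(probs)
layer2 = argmax_labels(fuse_detections(probs, detections))
layer3 = enforce_rectangles(layer2, detections)

for name, labeling in (("layer 1", layer1), ("layer 2", layer2), ("layer 3", layer3)):
    print(name)
    for row in labeling:
        print(" ", [LABELS[i] for i in row])
```

In this toy input, the initial labeling marks two cells of the window as window, the detector evidence recovers a third, and the rectangularity rule fills in the fourth, which mirrors the abstract's claim that each layer refines the output of the one below it.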
