Efficient 2D and 3D Facade Segmentation Using Auto-Context

This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades. Building facades are highly structured, and consequently most methods proposed for this problem aim to exploit this strong prior information. Contrary to most prior work, we describe a system that is almost domain independent and consists of standard segmentation methods. We train a sequence of boosted decision trees using auto-context features, and the sequence is learned using stacked generalization. We find that this technique performs better than, or comparably to, all previously published methods, and we present empirical results on all available 2D and 3D facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test-time inference.
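
The idea of a classifier sequence with auto-context features, trained by stacked generalization, can be illustrated with a minimal sketch. This is not the paper's implementation, and several pieces are assumptions made only for illustration: a simple nearest-centroid classifier (the hypothetical `CentroidClassifier`) stands in for the boosted decision trees, each "image" is a 1D row of pixels, and the context features are the previous stage's class probabilities at a few hypothetical neighbour offsets. The stacking step is the part to note: the context passed to the next stage comes from models that did not see that image during training.

```python
import numpy as np


class CentroidClassifier:
    """Stand-in for a boosted decision tree (assumption, not the paper's model):
    softmax over negative squared distances to the class means."""

    def fit(self, X, y, n_classes):
        self.means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
        return self

    def predict_proba(self, X):
        z = -((X[:, None, :] - self.means[None]) ** 2).sum(axis=-1)
        z -= z.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(z)
        return p / p.sum(axis=1, keepdims=True)


def context_features(P, offsets=(-2, -1, 0, 1, 2)):
    """Auto-context features: class probabilities at neighbouring pixels.
    The 1D layout and the offsets are illustrative choices."""
    idx = np.arange(len(P))
    return np.hstack([P[np.clip(idx + o, 0, len(P) - 1)] for o in offsets])


def train_auto_context(images, labels, n_classes, n_stages=3, n_folds=2):
    """Train a sequence of classifiers. Each stage sees the image features
    plus context computed from held-out predictions of the previous stage."""
    n = len(images)
    folds = [set(range(k, n, n_folds)) for k in range(n_folds)]
    ctx = [np.zeros((len(x), 0)) for x in images]  # first stage has no context
    stages = []
    for _ in range(n_stages):
        feats = [np.hstack([x, c]) for x, c in zip(images, ctx)]
        new_ctx = [None] * n
        for held in folds:
            train_ids = [i for i in range(n) if i not in held]
            clf = CentroidClassifier().fit(
                np.vstack([feats[i] for i in train_ids]),
                np.concatenate([labels[i] for i in train_ids]),
                n_classes,
            )
            # Stacked generalization: predict only on images this model never saw.
            for i in held:
                new_ctx[i] = context_features(clf.predict_proba(feats[i]))
        # The model kept for test time is trained on all images.
        stages.append(CentroidClassifier().fit(
            np.vstack(feats), np.concatenate(labels), n_classes))
        ctx = new_ctx
    return stages


def predict_auto_context(stages, x):
    """Run the stages in order, feeding each stage's probabilities to the next."""
    ctx = np.zeros((len(x), 0))
    for clf in stages:
        P = clf.predict_proba(np.hstack([x, ctx]))
        ctx = context_features(P)
    return P.argmax(axis=1)
```

Here `images` is a list of arrays of shape `(n_pixels, n_features)` and `labels` a list of integer arrays of shape `(n_pixels,)`. Calling `train_auto_context(images, labels, n_classes)` returns the list of stage classifiers, which `predict_auto_context` applies in order to a new image. Training on held-out predictions keeps later stages from relying on context that is overconfident on the training images.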
