Image parsing with graph grammars and Markov Random Fields applied to facade analysis

Existing approaches to parsing images of objects with complex, non-hierarchical structure explore a large search space that combines the structure of the object with the positions of its parts. Exploring this joint space requires randomized or greedy algorithms, which either do not produce repeatable results or depend strongly on the initial solution. To address this problem, we propose to model and optimize the structure of the object and the positions of its parts separately. We encode the possible object structures in a graph grammar. Then, for a given structure, we infer the positions of the parts using standard MAP-MRF techniques. This way, we limit the application of the less reliable greedy or randomized optimization algorithms to structure inference. We apply our method to parsing images of building facades. The results of our experiments compare favorably to the state of the art.
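To illustrate the second stage, here is a minimal sketch of MAP inference over part positions once the structure is fixed. It is not the method of the paper: it assumes the fixed structure reduces to a chain of parts (for example, facade elements stacked top to bottom), for which min-sum dynamic programming gives the exact minimum-energy solution. General structures would need a general-purpose MAP-MRF solver. The function name `map_chain_mrf`, the cost values, and the ordering penalty are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def map_chain_mrf(
    unary: Sequence[Sequence[float]],
    pairwise: Callable[[int, int], float],
) -> Tuple[float, List[int]]:
    """Exact MAP labelling of a chain MRF by min-sum dynamic programming.

    unary[i][x]    -- cost of placing part i at candidate position x
    pairwise(x, y) -- cost of consecutive parts sitting at positions x and y
    Returns (minimum energy, list with one position index per part).
    """
    cost = list(unary[0])          # best energy of a prefix ending at each position
    back: List[List[int]] = []     # back-pointers for recovering the argmin
    for i in range(1, len(unary)):
        new_cost, ptr = [], []
        for y, u in enumerate(unary[i]):
            # Best predecessor position for part i placed at y.
            best_x = min(range(len(cost)), key=lambda x: cost[x] + pairwise(x, y))
            new_cost.append(cost[best_x] + pairwise(best_x, y) + u)
            ptr.append(best_x)
        cost = new_cost
        back.append(ptr)

    # Backtrack from the cheapest final position.
    last = min(range(len(cost)), key=cost.__getitem__)
    labels = [last]
    for ptr in reversed(back):
        labels.append(ptr[labels[-1]])
    labels.reverse()
    return cost[last], labels


# Hypothetical example: three parts, four candidate rows (0 = top).
# The pairwise term penalizes any part that is not strictly below its predecessor.
unary_costs = [
    [0.2, 1.0, 1.0, 1.0],
    [1.0, 0.9, 0.1, 1.0],
    [1.0, 0.3, 0.0, 0.5],
]


def ordering_penalty(x: int, y: int) -> float:
    return 0.0 if y > x else 100.0


energy, positions = map_chain_mrf(unary_costs, ordering_penalty)
print(positions, round(energy, 3))
```

In this toy instance, the second and third parts both prefer row 2 on their unary costs alone, so choosing each position independently would violate the ordering constraint. The joint optimization resolves the conflict deterministically, which is the repeatability property the abstract attributes to the position-inference stage.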
