Creating consistent scene graphs using a probabilistic grammar

The growing number of 3D scenes in online repositories provides new opportunities for data-driven scene understanding, editing, and synthesis. Despite the plethora of data now available online, most of it cannot be used effectively for data-driven applications because it lacks the consistent segmentations, category labels, and/or functional groupings required for co-analysis. In this paper, we develop algorithms that infer such information by parsing with a probabilistic grammar learned from examples. First, given a collection of scene graphs with consistent hierarchies and labels, we train a probabilistic hierarchical grammar to represent the distributions of shapes, cardinalities, and spatial relationships of semantic objects within the collection. Then, we use the learned grammar to parse new scenes and assign them segmentations, labels, and hierarchies consistent with the collection. In experiments, we find that these algorithms work effectively on scene graphs of indoor scenes commonly found online (bedrooms, classrooms, and libraries); outperform alternative approaches that consider only shape similarities and/or spatial relationships without hierarchy; require relatively small sets of training data; are robust to moderate over-segmentation in the inputs; and can robustly transfer labels from one data set to another. As a result, the proposed algorithms can be used to provide consistent hierarchies for large collections of scenes within the same semantic class.
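
To make the train-then-parse idea concrete, the following is a minimal toy sketch and not the grammar described in the paper. It assumes a scene graph is a nested `(label, children)` tuple, learns only child cardinalities per parent label (shape and spatial-relationship terms are omitted), and scores hierarchies that are already given instead of searching over segmentations and groupings. The names `learn_cardinalities` and `log_prob`, the add-alpha smoothing, the fixed number of count bins, and the example labels are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

# Assumed toy representation: a node is (label, [children]); leaves have [].

def count_children(node, stats):
    """Record, for each internal node, how many children of each label it has."""
    label, children = node
    if children:
        stats[label].append(Counter(child[0] for child in children))
        for child in children:
            count_children(child, stats)

def learn_cardinalities(examples):
    """Estimate count histograms of child labels per parent label.

    Returns {parent: (n_observations, {child_label: Counter(count -> freq)})}.
    """
    stats = defaultdict(list)
    for root in examples:
        count_children(root, stats)
    model = {}
    for parent, observations in stats.items():
        child_labels = set().union(*observations)
        # obs[c] is 0 when a child label is absent from one observation.
        dists = {c: Counter(obs[c] for obs in observations) for c in child_labels}
        model[parent] = (len(observations), dists)
    return model

def log_prob(node, model, alpha=0.1, bins=10):
    """Log-probability of a labeled hierarchy under the cardinality model.

    alpha and bins are hypothetical smoothing parameters (add-alpha over a
    fixed number of count bins), chosen only to keep the sketch finite.
    """
    label, children = node
    n, dists = model.get(label, (0, {}))
    if not children and not dists:
        return 0.0  # a leaf label that never had children in training
    counts = Counter(child[0] for child in children)
    total = 0.0
    for child_label in set(dists) | set(counts):
        freq = dists.get(child_label, Counter())
        k = counts[child_label]
        total += math.log((freq[k] + alpha) / (n + alpha * bins))
    return total + sum(log_prob(c, model, alpha, bins) for c in children)

# Hypothetical training collection with consistent labels and hierarchies.
training = [
    ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", []),
                                    ("nightstand", [])]),
                 ("desk", [])]),
    ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", [])]),
                 ("desk", [])]),
]
model = learn_cardinalities(training)

# Two candidate interpretations of a new scene; the higher score is preferred.
good = ("bedroom", [("sleeping_area", [("bed", []), ("nightstand", [])]),
                    ("desk", [])])
bad = ("bedroom", [("sleeping_area", [("bed", []), ("bed", []), ("bed", [])]),
                   ("desk", []), ("desk", []), ("desk", [])])
best = max([good, bad], key=lambda scene: log_prob(scene, model))
```

Here `best` is the candidate whose child counts match the training collection more closely. A full parser in the spirit of the abstract would also have to propose the candidate segmentations and groupings itself and weigh geometric evidence, which this sketch leaves out.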
