Learning Generative Models of 3D Structures

3D models of objects and scenes are critical to many academic disciplines and industrial applications. Of particular interest is the emerging opportunity for 3D graphics to serve artificial intelligence: computer vision systems can benefit from synthetically generated training data rendered from virtual 3D scenes, and robots can be trained to navigate in and interact with real-world environments by first acquiring skills in simulated ones. One of the most promising ways to achieve this is by learning and applying generative models of 3D content: computer programs that can synthesize new 3D shapes and scenes. To allow users to edit and manipulate the synthesized 3D content to achieve their goals, the generative model should also be structure-aware: it should express 3D shapes and scenes using abstractions that allow manipulation of their high-level structure. This state-of-the-art report surveys historical work and recent progress on learning structure-aware generative models of 3D shapes and scenes. We present fundamental representations of 3D shape and scene geometry and structures, describe prominent methodologies including probabilistic models, deep generative models, program synthesis, and neural networks for structured data, and cover many recent methods for structure-aware synthesis of 3D shapes and indoor scenes.

CCS Concepts
• Computing methodologies → Structure-aware generative models; Representation of structured data; Deep learning; Neural networks; Shape and scene synthesis; Hierarchical models

Authors: S. Chaudhuri, D. Ritchie, J. Wu, K. Xu, and H. Zhang
Corresponding author: kevin.kai.xu@gmail.com
© 2020 The Author(s). Computer Graphics Forum © 2020 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
