LSD-StructureNet: Modeling Levels of Structural Detail in 3D Part Hierarchies

Generative models for 3D shapes represented by hierarchies of parts can generate realistic and diverse sets of outputs. However, existing models suffer from the key practical limitation of modelling shapes holistically and thus cannot perform conditional sampling, i.e. they are not able to generate variants on individual parts of generated shapes without modifying the rest of the shape. This is limiting for applications such as 3D CAD design that involve adjusting created shapes at multiple levels of detail. To address this, we introduce LSD-StructureNet, an augmentation to the StructureNet architecture that enables re-generation of parts situated at arbitrary positions in the hierarchies of its outputs. We achieve this by learning individual, probabilistic conditional decoders for each hierarchy depth. We evaluate LSD-StructureNet on the PartNet dataset, the largest dataset of 3D shapes represented by hierarchies of parts. Our results show that contrarily to existing methods, LSDStructureNet can perform conditional sampling without impacting inference speed or the realism and diversity of its outputs.

[1]  Subhransu Maji,et al.  Learning Generative Models of Shape Handles , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Andrew Y. Ng,et al.  Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.

[3]  Richard A. Newcombe,et al.  DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  L. Guibas,et al.  DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry , 2020, ArXiv.

[5]  Lin Gao,et al.  TM-NET , 2020, ACM Trans. Graph..

[6]  Siddhartha Chaudhuri,et al.  A probabilistic model for component-based shape synthesis , 2012, ACM Trans. Graph..

[7]  Niloy J. Mitra,et al.  ShapeAssembly , 2020, ACM Trans. Graph..

[8]  S. M. Ali Eslami,et al.  PolyGen: An Autoregressive Generative Model of 3D Meshes , 2020, ICML.

[9]  Daniel Cohen-Or,et al.  CompoNet: Learning to Generate the Unseen by Part Synthesis and Composition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[10]  Yoshua Bengio,et al.  Z-Forcing: Training Stochastic Recurrent Networks , 2017, NIPS.

[11]  Leonidas J. Guibas,et al.  PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[13]  Dani Lischinski,et al.  SAGNet , 2018, ACM Trans. Graph..

[14]  Junseok Kwon,et al.  3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  Enrico Magli,et al.  Learning Localized Generative Models for 3D Point Clouds via Graph Convolution , 2018, ICLR.

[16]  Jan Kautz,et al.  NVAE: A Deep Hierarchical Variational Autoencoder , 2020, NeurIPS.

[17]  Hao Zhang,et al.  PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Leonidas J. Guibas,et al.  StructureNet , 2019, ACM Trans. Graph..

[19]  Alexander Schwing,et al.  Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Christopher K. I. Williams,et al.  The shape variational autoencoder: A deep generative model of part‐segmented 3D objects , 2017, Comput. Graph. Forum.

[21]  Siddhartha Chaudhuri,et al.  SCORES: Shape Composition with Recursive Substructure Priors , 2018, ACM Trans. Graph..

[22]  Leonidas J. Guibas,et al.  Composite Shape Modeling via Latent Space Factorization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Mathieu Aubry,et al.  A Papier-Mache Approach to Learning 3D Surface Generation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Jiajun Wu,et al.  Learning to Infer and Execute 3D Shape Programs , 2019, ICLR.

[26]  Joelle Pineau,et al.  A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues , 2016, AAAI.

[27]  Jiajun Wu,et al.  Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling , 2016, NIPS.

[28]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Dhruv Batra,et al.  Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[31]  Yoshua Bengio,et al.  A Recurrent Latent Variable Model for Sequential Data , 2015, NIPS.

[32]  Leonidas J. Guibas,et al.  GRASS: Generative Recursive Autoencoders for Shape Structures , 2017, ACM Trans. Graph..