MRGAN: Multi-Rooted 3D Shape Representation Learning with Unsupervised Part Disentanglement

We introduce MRGAN, a multi-rooted GAN and the first generative adversarial network to learn a part-disentangled 3D shape representation without any part supervision. The network fuses multiple branches of tree-structured graph-convolution layers that produce point clouds in a controllable manner. Specifically, each branch learns to grow a different shape part, offering control over shape generation at the part level. Our network encourages disentangled generation of semantic parts via two key ingredients: a root-mixing training strategy that decorrelates the branches to facilitate disentanglement, and a set of loss terms designed with part disentanglement and shape semantics in mind. Among these, a novel convexity loss incentivizes the generation of parts that are more convex, as semantic parts tend to be, while a root-dropping loss ensures that each root seeds a single part, preventing the degeneration or over-growth of the point-producing branches. We evaluate the performance of our network on a number of 3D shape classes, and offer qualitative and quantitative comparisons to previous works and baseline approaches. We demonstrate the controllability offered by our part-disentangled representation through two shape-modeling applications, part mixing and individual part variation, neither of which receives segmented shapes as input.
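The multi-rooted design and the root-mixing strategy described above can be sketched in code. This is a minimal illustrative mock-up, not the paper's implementation: the learned tree-structured graph-convolution branches are replaced by fixed random linear maps, and the root count, latent size, and points per branch are assumed values chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ROOTS = 4           # hypothetical: one latent root per intended part
ROOT_DIM = 96           # hypothetical latent dimension
POINTS_PER_BRANCH = 128  # hypothetical points grown by each branch

# Stand-in for the learned tree-structured graph-conv branches:
# each branch here is just a fixed random linear map from a root
# vector to a (POINTS_PER_BRANCH, 3) part point cloud.
branch_weights = [rng.normal(size=(ROOT_DIM, POINTS_PER_BRANCH * 3))
                  for _ in range(NUM_ROOTS)]

def generate(roots):
    """Fuse the per-branch part clouds into one shape of
    NUM_ROOTS * POINTS_PER_BRANCH points."""
    parts = [np.tanh(r @ w).reshape(POINTS_PER_BRANCH, 3)
             for r, w in zip(roots, branch_weights)]
    return np.concatenate(parts, axis=0)

def mix_roots(roots_a, roots_b, mask):
    """Root mixing: swap individual roots between two latent samples.
    Training on mixed roots pushes each branch to grow a coherent part
    independently of the other branches' roots, decorrelating them."""
    return [rb if m else ra for ra, rb, m in zip(roots_a, roots_b, mask)]

roots_a = [rng.normal(size=ROOT_DIM) for _ in range(NUM_ROOTS)]
roots_b = [rng.normal(size=ROOT_DIM) for _ in range(NUM_ROOTS)]

# Take branches 1 and 3 from sample B, the rest from sample A.
mixed = mix_roots(roots_a, roots_b, mask=[False, True, False, True])
shape = generate(mixed)  # combined point cloud, shape (512, 3)
```

The same machinery suggests how part mixing works at inference time: taking some roots from one shape and the rest from another composes a new shape part by part, with no segmentation labels involved.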
