PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions

3D generative shape modeling is a fundamental research area in computer vision and interactive computer graphics, with many real-world applications. This paper investigates the novel problem of generating 3D shape point cloud geometry from a symbolic part tree representation. To learn such a conditional shape generation procedure end to end, we propose a conditional GAN "part tree"-to-"point cloud" model (PT2PC) that disentangles the structural and geometric factors. The proposed model incorporates the part tree condition into the architecture design by passing messages top-down and bottom-up along the part tree hierarchy. Experimental results and a user study demonstrate the strengths of our method in generating perceptually plausible and diverse 3D point clouds given the part tree condition. We also propose a novel structural measure for evaluating whether the generated shape point clouds satisfy the part tree conditions.
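
To make the idea of passing messages bottom-up and top-down along a part tree concrete, the following is a minimal toy sketch, not the PT2PC architecture itself. Everything in it is an assumption made for illustration: the `PartNode` class, the `VOCAB` list, the one-hot label features, the plain summation, and the fixed 0.5 weight on the parent message are hypothetical stand-ins for whatever learned components the actual model uses.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical part vocabulary; a real model would use learned embeddings.
VOCAB = ["chair", "back", "seat", "base", "leg"]


@dataclass
class PartNode:
    label: str
    children: List["PartNode"] = field(default_factory=list)
    up: List[float] = field(default_factory=list)    # bottom-up feature
    down: List[float] = field(default_factory=list)  # top-down feature


def one_hot(label: str) -> List[float]:
    """Encode a part label as a one-hot vector over VOCAB."""
    return [1.0 if label == name else 0.0 for name in VOCAB]


def add(a: List[float], b: List[float]) -> List[float]:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def bottom_up(node: PartNode) -> List[float]:
    """Summarize each subtree: own label plus the sum of child summaries."""
    feat = one_hot(node.label)
    for child in node.children:
        feat = add(feat, bottom_up(child))
    node.up = feat
    return feat


def top_down(node: PartNode, parent_msg: List[float]) -> None:
    """Mix each node's summary with context from its parent, then recurse."""
    # The 0.5 weight is an arbitrary illustrative choice, not a learned value.
    node.down = add(node.up, [0.5 * m for m in parent_msg])
    for child in node.children:
        top_down(child, node.down)


# Example part tree: a chair with a back, a seat, and a base with two legs.
leg_a, leg_b = PartNode("leg"), PartNode("leg")
base = PartNode("base", [leg_a, leg_b])
chair = PartNode("chair", [PartNode("back"), PartNode("seat"), base])

bottom_up(chair)                        # leaves -> root
top_down(chair, [0.0] * len(VOCAB))     # root -> leaves

print("root summary:", chair.up)
print("leg feature with ancestor context:", leg_a.down)
```

After both passes, the root's bottom-up feature counts every part in the tree, and each leaf's top-down feature carries information about its ancestors and siblings. In a conditional generator, per-leaf features of this kind could then condition the points generated for each part; how PT2PC actually does this is not specified in the abstract.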
