GT-Net: Variational Autoencoder Networks based on Graph Transformer for 3D Shape Learning

In this paper, we introduce a novel structure-aware method that generates diverse and realistic 3D shapes from semantic parts. Most previous works consider only geometric information and neglect the structural and contextual relationships between shape parts. This can lead to incorrect combinations of parts during generation and degrades generation quality. To address this issue, we learn a structure-aware latent representation for 3D shapes by training a variational autoencoder (VAE). Specifically, we use a graph to represent the semantic parts of a 3D shape and their structural relationships. Based on this graph representation, we design a generative network built on the graph transformer architecture, called Graph Transformer VAE network (GT-Net), to encode and decode the graph-represented 3D shape. Our experimental results demonstrate that our method outperforms previous methods across various shape families, especially in capturing shape details.
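As a rough illustration of the graph representation described above, the following minimal sketch stores semantic parts as nodes and structural relationships as edges, then applies one attention step restricted to graph neighbors. It is not the GT-Net architecture: the part names, feature vectors, edge list, and the functions `neighbor_mask` and `graph_attention` are all hypothetical, and the learned query, key, and value projections of a real graph transformer are omitted.

```python
import math

# Hypothetical part graph for a chair: nodes are semantic parts, edges are
# structural relations. Names, features, and edges are illustrative only.
parts = ["seat", "back", "leg_left", "leg_right"]
features = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
    [0.5, 0.5],
]
edges = [(0, 1), (0, 2), (0, 3), (2, 3)]  # e.g. adjacency and symmetry


def neighbor_mask(num_nodes, edges):
    """Boolean mask: node i may attend to itself and to its graph neighbors."""
    mask = [[i == j for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        mask[i][j] = True
        mask[j][i] = True
    return mask


def graph_attention(features, mask):
    """One attention step without learned weights, limited to neighbors."""
    dim = len(features[0])
    out = []
    for i, query in enumerate(features):
        # Scaled dot-product scores against the allowed nodes only.
        scores = []
        for j, key in enumerate(features):
            if mask[i][j]:
                dot = sum(q * k for q, k in zip(query, key))
                scores.append((j, dot / math.sqrt(dim)))
        # Numerically stable softmax over the allowed nodes.
        top = max(s for _, s in scores)
        weights = [(j, math.exp(s - top)) for j, s in scores]
        total = sum(w for _, w in weights)
        out.append([
            sum(w / total * features[j][d] for j, w in weights)
            for d in range(dim)
        ])
    return out


mask = neighbor_mask(len(parts), edges)
updated = graph_attention(features, mask)
```

In this sketch the mask is what makes the attention structure-aware: "back" and "leg_left" share no edge, so neither contributes to the other's updated feature.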
