Deep Marching Tetrahedra: A Hybrid Representation for High-Resolution 3D Shape Synthesis

We introduce DMTet, a deep 3D conditional generative model that can synthesize high-resolution 3D shapes using simple user guides such as coarse voxels. It marries the merits of implicit and explicit 3D representations by leveraging a novel hybrid 3D representation. Compared to current implicit approaches, which are trained to regress signed distance values, DMTet directly optimizes for the reconstructed surface, which enables us to synthesize finer geometric details with fewer artifacts. Unlike deep 3D generative models that directly generate explicit representations such as meshes, our model can synthesize shapes with arbitrary topology. The core of DMTet includes a deformable tetrahedral grid that encodes a discretized signed distance function and a differentiable marching tetrahedra layer that converts the implicit signed distance representation to an explicit surface mesh representation. This combination allows joint optimization of the surface geometry and topology, as well as generation of a hierarchy of subdivisions, using reconstruction and adversarial losses defined explicitly on the surface mesh. Trained on a dataset of complex 3D animal shapes, our approach significantly outperforms existing work on conditional shape synthesis from coarse voxel inputs. Project page: https://nv-tlabs.github.io/DMTet/.
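
As a rough illustration (not the authors' implementation), the differentiability of the marching tetrahedra layer comes from the fact that each surface vertex is placed at the linear zero-crossing of the signed distance function along a tetrahedron edge whose endpoints have opposite signs. The minimal PyTorch sketch below, with a hypothetical interpolate_crossing helper, shows how such a vertex remains differentiable with respect to both the grid vertex positions and the SDF values, which is what allows surface-mesh losses to be backpropagated to the underlying grid.

```python
import torch

def interpolate_crossing(v_a, v_b, s_a, s_b, eps=1e-8):
    """Zero-crossing of a linearly interpolated SDF along the edge (v_a, v_b).

    v_a, v_b: (..., 3) positions of the two edge endpoints.
    s_a, s_b: (...,)   signed distance values at those endpoints (opposite signs).
    The returned point is differentiable w.r.t. both the positions and the SDF
    values, so losses defined on the extracted mesh can reach the grid.
    """
    t = s_a / (s_a - s_b + eps)            # fraction along the edge where the SDF is zero
    return v_a + t.unsqueeze(-1) * (v_b - v_a)

# Example: one edge of a tetrahedron straddling the surface.
v_a = torch.tensor([0.0, 0.0, 0.0], requires_grad=True)
v_b = torch.tensor([1.0, 0.0, 0.0], requires_grad=True)
s_a = torch.tensor(-0.25, requires_grad=True)
s_b = torch.tensor(0.75, requires_grad=True)
p = interpolate_crossing(v_a, v_b, s_a, s_b)   # -> [0.25, 0.0, 0.0]
p.sum().backward()                             # gradients flow to positions and SDF values
```

In the full method, the triangle connectivity inside each tetrahedron is selected from the sign configuration of its four SDF values, while all vertex positions on the surface are produced by this kind of interpolation, so the extracted mesh stays end-to-end differentiable.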
