The shape variational autoencoder: A deep generative model of part‐segmented 3D objects

We introduce a generative model of part‐segmented 3D objects: the shape variational autoencoder (ShapeVAE). The ShapeVAE describes a joint distribution over the existence of object parts, the locations of a dense set of surface points, and the surface normals associated with these points. Our model uses a deep encoder‐decoder architecture that leverages the part‐decomposability of 3D objects to embed high‐dimensional shape representations and sample novel instances. Given an input collection of part‐segmented objects with dense point correspondences, the ShapeVAE can synthesize novel, realistic shapes, and conditional inference enables imputation of missing parts or surface normals. In addition, by generating both points and surface normals, our model allows the use of powerful surface‐reconstruction methods for mesh synthesis. We provide a quantitative evaluation of the ShapeVAE on shape‐completion and test‐set log‐likelihood tasks and demonstrate that the model performs favourably against strong baselines. We demonstrate qualitatively that the ShapeVAE produces plausible shape samples and that it captures a semantically meaningful shape embedding. We also show that the ShapeVAE facilitates mesh reconstruction by sampling consistent surface normals.
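
For orientation, the generic variational autoencoder objective can be written as below. This is a sketch of the standard evidence lower bound only, not necessarily the exact objective optimized by the ShapeVAE. The notation is assumed for illustration: x collects the three observed quantities named above (part existence, surface point locations, surface normals), z is the latent shape embedding, q_phi is the encoder and p_theta the decoder.

```latex
% Standard VAE evidence lower bound (ELBO); the notation is an assumption, not the paper's.
% x : observed shape (part existence, surface points, surface normals)
% z : latent shape embedding
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards accurate reconstruction of the shape from its embedding, and the second keeps the approximate posterior close to the prior p(z), so that drawing z from p(z) and decoding it yields new samples.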
