Occupancy Networks: Learning 3D Reconstruction in Function Space

With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.

[1]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Roberto Cipolla,et al.  Structure and motion from silhouettes , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[3]  Jitendra Malik,et al.  Learning a Multi-View Stereo Machine , 2017, NIPS.

[4]  Leonidas J. Guibas,et al.  Learning Representations and Generative Models for 3D Point Clouds , 2017, ICML.

[5]  Yiyi Liao,et al.  Deep Marching Cubes: Learning Explicit Surface Representations , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Jitendra Malik,et al.  Learning Category-Specific Mesh Reconstruction from Image Collections , 2018, ECCV.

[7]  William E. Lorensen,et al.  Marching cubes: A high resolution 3D surface construction algorithm , 1987, SIGGRAPH.

[8]  Jiajun Wu,et al.  MarrNet: 3D Shape Reconstruction via 2.5D Sketches , 2017, NIPS.

[9]  Hao Zhang,et al.  Learning Implicit Fields for Generative Shape Modeling , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Jiajun Wu,et al.  Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling , 2016, NIPS.

[11]  Harris Drucker,et al.  Improving generalization performance using double backpropagation , 1992, IEEE Trans. Neural Networks.

[12]  Mathieu Aubry,et al.  AtlasNet: A Papier-M\^ach\'e Approach to Learning 3D Surface Generation , 2018, CVPR 2018.

[13]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jiajun Wu,et al.  Learning to Reconstruct Shapes from Unseen Classes , 2018, NeurIPS.

[15]  W. Marsden I and J , 2012 .

[16]  Pierre Vandergheynst,et al.  Geometric Deep Learning: Going beyond Euclidean data , 2016, IEEE Signal Process. Mag..

[17]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[18]  Chen Kong,et al.  Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Lu Fang,et al.  SurfaceNet: An End-to-End 3D Neural Network for Multiview Stereopsis , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[22]  Richard A. Newcombe,et al.  DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[24]  Daniel Cremers,et al.  Shedding light on stereoscopic segmentation , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[25]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[26]  Marc Pollefeys,et al.  From Point Clouds to Mesh Using Regression , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  Jitendra Malik,et al.  Hierarchical Surface Prediction for 3D Object Reconstruction , 2017, 2017 International Conference on 3D Vision (3DV).

[28]  Gernot Riegler,et al.  OctNet: Learning Deep 3D Representations at High Resolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  J. Tenenbaum,et al.  MarrNet : 3 D Shape Reconstruction via 2 . 5 D Sketches , 2017 .

[30]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[31]  Hugo Larochelle,et al.  Modulating early visual processing by language , 2017, NIPS.

[32]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[33]  Daniel Cremers,et al.  Robust Variational Segmentation of 3D Objects from Multiple Views , 2006, DAGM-Symposium.

[34]  Sebastian Nowozin,et al.  Which Training Methods for GANs do actually Converge? , 2018, ICML.

[35]  Jiajun Wu,et al.  Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  Olivier D. Faugeras,et al.  Level Set Methods and the Stereo Problem , 1997, Scale-Space.

[37]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[38]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[39]  Theodore Lim,et al.  Generative and Discriminative Voxel Modeling with Convolutional Neural Networks , 2016, ArXiv.

[40]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[41]  Chris L. Jackins,et al.  Oct-trees and their use in representing three-dimensional objects , 1980 .

[42]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  J. Sethian,et al.  Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations , 1988 .

[44]  Silvio Savarese,et al.  3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[45]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Wei Liu,et al.  Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images , 2018, ECCV.

[47]  Abhinav Gupta,et al.  Learning a Predictable and Generative Vector Representation for Objects , 2016, ECCV.

[48]  Yee Whye Teh,et al.  Neural Processes , 2018, ArXiv.

[49]  Hao Su,et al.  A Point Set Generation Network for 3D Object Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Max Jaderberg,et al.  Unsupervised Learning of 3D Structure from Images , 2016, NIPS.

[51]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[52]  Jiajun Wu,et al.  Learning Shape Priors for Single-View 3D Completion and Reconstruction , 2018, ECCV.

[53]  A. Dervieux,et al.  A finite element method for the simulation of a Rayleigh-Taylor instability , 1980 .

[54]  Nikos Paragios,et al.  AtlasNet: Multi-atlas Non-linear Deep Networks for Medical Image Segmentation , 2018, MICCAI.

[55]  Michael Garland,et al.  Simplifying surfaces with color and texture using quadric error metrics , 1998, IEEE Visualization.

[56]  Anders P. Eriksson,et al.  Deep Level Sets: Implicit Surface Representations for 3D Shape Inference , 2019, ArXiv.

[57]  Marcus A. Magnor,et al.  Space-time isosurface evolution for temporally coherent 3D reconstruction , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[58]  Michael M. Kazhdan,et al.  Screened poisson surface reconstruction , 2013, TOGS.

[59]  Olivier D. Faugeras,et al.  Modelling dynamic scenes by registering multi-view image sequences , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[60]  Marc Levoy,et al.  A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[61]  Gabriel Taubin,et al.  SSD: Smooth Signed Distance Surface Reconstruction , 2011, Comput. Graph. Forum.

[62]  Luc Van Gool,et al.  RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[63]  Ozan ARSLAN 3d Object Reconstruction from a Single Image. , 2014 .

[64]  Michael J. Black,et al.  Towards Probabilistic Volumetric Reconstruction Using Ray Potentials , 2015, 2015 International Conference on 3D Vision.

[65]  Horst Bischof,et al.  OctNetFusion: Learning Depth Fusion from Data , 2017, 2017 International Conference on 3D Vision (3DV).

[66]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[67]  J. Sethian,et al.  FRONTS PROPAGATING WITH CURVATURE DEPENDENT SPEED: ALGORITHMS BASED ON HAMILTON-JACOB1 FORMULATIONS , 2003 .

[68]  Xiaowu Chen,et al.  3D Mesh Labeling via Deep Convolutional Neural Networks , 2015, ACM Trans. Graph..

[69]  Rachid Deriche,et al.  A Review of Statistical Approaches to Level Set Segmentation: Integrating Color, Texture, Motion and Shape , 2007, International Journal of Computer Vision.

[70]  Michael J. Black,et al.  Generating 3D faces using Convolutional Mesh Autoencoders , 2018, ECCV.

[71]  Richard Szeliski,et al.  Rapid octree construction from image sequences , 1993 .

[72]  Gabriel Taubin,et al.  The ball-pivoting algorithm for surface reconstruction , 1999, IEEE Transactions on Visualization and Computer Graphics.

[73]  Jianxiong Xiao,et al.  Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  Donald Meagher,et al.  Geometric modeling using octree encoding , 1982, Comput. Graph. Image Process..

[75]  Stefano Soatto,et al.  Stereoscopic Segmentation , 2001, ICCV.

[76]  Yan Zhang,et al.  3D shape segmentation via shape fully convolutional networks , 2017, Comput. Graph..

[77]  Subhransu Maji,et al.  3D Shape Induction from 2D Views of Multiple Objects , 2016, 2017 International Conference on 3D Vision (3DV).

[78]  Jitendra Malik,et al.  End-to-End Recovery of Human Shape and Pose , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[79]  Michael M. Kazhdan,et al.  Poisson surface reconstruction , 2006, SGP '06.

[80]  Andreas Geiger,et al.  Learning 3D Shape Completion from Laser Scan Data with Weak Supervision , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[81]  Matthias Nießner,et al.  Shape Completion Using 3D-Encoder-Predictor CNNs and Shape Synthesis , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[82]  David Meger,et al.  Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation , 2018, NeurIPS.