3D Reconstruction of Novel Object Shapes from Single Images

Accurately predicting the 3D shape of an arbitrary object in any pose from a single image is a key goal of computer vision research. This is challenging because it requires a model to learn a representation that can infer both the visible and occluded portions of any object from a limited training set; a training set that covers all possible object shapes is infeasible. Such learning-based approaches are inherently vulnerable to overfitting, and implementing them successfully depends on both the architecture design and the training approach. We present an extensive investigation of factors specific to architecture design, training, experiment design, and evaluation that influence reconstruction performance and measurement. We show that our proposed SDFNet achieves state-of-the-art performance on seen and unseen shapes relative to the existing methods GenRe [53] and OccNet [29]. We provide the first large-scale evaluation of single-image shape reconstruction on unseen objects. The source code, data, and trained models can be found at https://github.com/rehg-lab/3DShapeGen.
