Image2Mesh: A Learning Framework for Single Image 3D Reconstruction

One challenge that remains open in 3D deep learning is how to efficiently represent 3D data to feed deep networks. Recent works have relied on volumetric or point cloud representations, but such approaches suffer from a number of issues such as computational complexity, unordered data, and lack of finer geometry. This paper demonstrates that a mesh representation (i.e. vertices and faces to form polygonal surfaces) is able to capture fine-grained geometry for 3D reconstruction tasks. A mesh however is also unstructured data similar to point clouds. We address this problem by proposing a learning framework to infer the parameters of a compact mesh representation rather than learning from the mesh itself. This compact representation encodes a mesh using free-form deformation and a sparse linear combination of models allowing us to reconstruct 3D meshes from single images. In contrast to prior work, we do not rely on silhouettes and landmarks to perform 3D reconstruction. We evaluate our method on synthetic and real-world datasets with very promising results. Our framework efficiently reconstructs 3D objects in a low-dimensional way while preserving its important geometrical aspects.

[1]  Chen Kong,et al.  Structure from Category: A Generic and Prior-Less Approach , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[2]  Karthik Ramani,et al.  SurfNet: Generating 3D Shape Surfaces Using Deep Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Simon Lucey,et al.  Rethinking Reprojection: Closing the Loop for Pose-Aware Shape Reconstruction from a Single Image , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[4]  Wen Gao,et al.  Robust Estimation of 3D Human Poses from a Single Image , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Jitendra Malik,et al.  Learning a Multi-View Stereo Machine , 2017, NIPS.

[6]  Yuandong Tian,et al.  Single Image 3D Interpreter Network , 2016, ECCV.

[7]  Yiyi Liao,et al.  Deep Marching Cubes: Learning Explicit Surface Representations , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Silvio Savarese,et al.  Beyond PASCAL: A benchmark for 3D object detection in the wild , 2014, IEEE Winter Conference on Applications of Computer Vision.

[9]  Peter Wonka,et al.  PolyFit: Polygonal Surface Reconstruction from Point Clouds , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Hao Su,et al.  A Point Set Generation Network for 3D Object Reconstruction from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Abhinav Gupta,et al.  Marr Revisited: 2D-3D Alignment via Surface Normal Prediction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Thomas A. Funkhouser,et al.  Interactive 3D Modeling with a Generative Adversarial Network , 2017, 2017 International Conference on 3D Vision (3DV).

[13]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[15]  Chen Kong,et al.  Using Locally Corresponding CAD Models for Dense 3D Reconstructions from a Single Image , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Xiaowei Zhou,et al.  Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Michael J. Black,et al.  Towards Probabilistic Volumetric Reconstruction Using Ray Potentials , 2015, 2015 International Conference on 3D Vision.

[18]  Jitendra Malik,et al.  Hierarchical Surface Prediction for 3D Object Reconstruction , 2017, 2017 International Conference on 3D Vision (3DV).

[19]  Marc Pollefeys,et al.  Multi-Label Semantic 3D Reconstruction Using Voxel Blocks , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[20]  Xiaowei Zhou,et al.  3D Shape Reconstruction from 2D Landmarks: A Convex Formulation , 2014, ArXiv.

[21]  Thomas W. Sederberg,et al.  Free-form deformation of solid geometric models , 1986, SIGGRAPH.

[22]  Subhransu Maji,et al.  3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks , 2017, 2017 International Conference on 3D Vision (3DV).

[23]  Anders P. Eriksson,et al.  Compact Model Representation for 3D Reconstruction , 2017, 2017 International Conference on 3D Vision (3DV).

[24]  Subhransu Maji,et al.  3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks , 2017, 2017 International Conference on 3D Vision (3DV).

[25]  Derek Hoiem,et al.  Completing 3D object shape from one depth image , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[27]  Xiao Tan,et al.  Single View 3D Reconstruction under an Uncalibrated Camera and an Unknown Mirror Sphere , 2016, 2016 Fourth International Conference on 3D Vision (3DV).

[28]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Gernot Riegler,et al.  OctNet: Learning Deep 3D Representations at High Resolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jiajun Wu,et al.  MarrNet: 3D Shape Reconstruction via 2.5D Sketches , 2017, NIPS.

[31]  Jiajun Wu,et al.  Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling , 2016, NIPS.

[32]  Abhinav Gupta,et al.  Learning a Predictable and Generative Vector Representation for Objects , 2016, ECCV.

[33]  Jiaxin Li,et al.  SO-Net: Self-Organizing Network for Point Cloud Analysis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[35]  Jitendra Malik,et al.  Category-specific object reconstruction from a single image , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Yang Liu,et al.  O-CNN , 2017, ACM Trans. Graph..

[37]  Chen Kong,et al.  Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction , 2017, AAAI.

[38]  Silvio Savarese,et al.  3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction , 2016, ECCV.

[39]  Niloy J. Mitra,et al.  Learning Semantic Deformation Flows with 3D Convolutional Networks , 2016, ECCV.

[40]  Oliver Grau,et al.  VConv-DAE: Deep Volumetric Shape Learning Without Object Labels , 2016, ECCV Workshops.

[41]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Thomas Brox,et al.  Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[43]  Max Jaderberg,et al.  Unsupervised Learning of 3D Structure from Images , 2016, NIPS.

[44]  Honglak Lee,et al.  Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision , 2016, NIPS.

[45]  Derek Hoiem,et al.  Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46]  Silvio Savarese,et al.  DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[47]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[48]  Silvio Savarese,et al.  Weakly Supervised Generative Adversarial Networks for 3D Reconstruction , 2017, ArXiv.