Voxel-Based 3D Object Reconstruction from Single 2D Image Using Variational Autoencoders

In recent years, learning-based approaches to 3D reconstruction have gained popularity due to their encouraging results. However, unlike 2D images, 3D data has no canonical representation that is both computationally lean and memory-efficient. Moreover, generating a 3D model directly from a single 2D image is even more challenging because a single image offers limited detail for 3D reconstruction. Existing learning-based techniques still lack the resolution, efficiency, and smoothness of 3D models required for many practical applications. In this paper, we propose two models for voxel-based 3D object reconstruction (V3DOR) from a single 2D image with improved accuracy: one using autoencoders (AE) and another using variational autoencoders (VAE). In both models, the encoder learns a compressed latent representation from a single 2D image, and the decoder generates the corresponding 3D model. Our contribution is twofold. First, to the best of the authors’ knowledge, this is the first time that variational autoencoders (VAE) have been employed for the 3D reconstruction problem. Second, the proposed models extract a discriminative set of features and generate smoother, high-resolution 3D models. To evaluate the efficacy of the proposed method, experiments were conducted on the benchmark ShapeNet dataset. The results confirm that the proposed method outperforms state-of-the-art methods.
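
The abstract does not spell out the training objective, so the following is only a minimal sketch of the textbook VAE formulation applied to voxel output, not the authors' implementation. It assumes a binary occupancy grid as the decoder target, a diagonal Gaussian latent, and a hypothetical weighting parameter `beta`; all function names are illustrative.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

def voxel_bce(pred, target, eps=1e-7):
    """Binary cross-entropy per sample, summed over all voxels of the grid."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    per_voxel = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return per_voxel.reshape(per_voxel.shape[0], -1).sum(axis=1)

def vae_loss(pred, target, mu, log_var, beta=1.0):
    """Batch mean of reconstruction term plus beta-weighted KL term (beta is an assumption)."""
    return np.mean(voxel_bce(pred, target) + beta * kl_to_standard_normal(mu, log_var))

# Toy usage: batch of 2, latent size 8, 4x4x4 occupancy grid (sizes are illustrative).
rng = np.random.default_rng(0)
mu = np.zeros((2, 8))
log_var = np.zeros((2, 8))
z = reparameterize(mu, log_var, rng)   # would be fed to a decoder network
pred = np.full((2, 4, 4, 4), 0.5)      # stand-in for decoder output probabilities
target = np.zeros((2, 4, 4, 4))        # stand-in for ground-truth occupancy
loss = vae_loss(pred, target, mu, log_var)
```

A plain autoencoder variant would drop the sampling step and the KL term and train on the reconstruction term alone.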
