Paper information - GENERATIVE ADVERSARIAL NETWORKS FOR SINGLE PHOTO 3D RECONSTRUCTION


Fast but precise 3D reconstructions of cultural heritage scenes are increasingly in demand in archaeology and architecture. While modern multi-image 3D reconstruction approaches provide impressive results in terms of textured surface models, there is often a need to create a 3D model of an object for which only a single photo (or a few sparse photos) is available. This paper focuses on the single photo 3D reconstruction problem for lost cultural objects of which only a few images remain. We use the image-to-voxel translation network (Z-GAN) as a starting point. Z-GAN uses skip connections in the generator network to transfer 2D features to a 3D voxel model effectively (Figure 1). As a result, the network can generate voxel models of previously unseen objects using the object silhouettes present in the input image and the knowledge obtained during the training stage. To train our Z-GAN network, we created a large dataset that includes aligned sets of images and corresponding voxel models of an ancient Greek temple. We evaluated the Z-GAN network for single photo reconstruction on complex structures such as temples, as well as on lost heritage that is still available in crowdsourced images. A comparison of the reconstruction results with state-of-the-art methods is also presented and discussed.

Figure 1: Overview of the Z-GAN generator network employed for 3D reconstruction from a single image.
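The abstract states only that skip connections carry 2D features into the 3D voxel model; it does not spell out the mechanism. The following is a minimal NumPy sketch of one plausible reading, assuming that a 2D encoder feature map is copied along a new depth axis and concatenated channel-wise with a 3D decoder volume. The function names (`inflate_2d_to_3d`, `skip_connect`) and all tensor sizes are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def inflate_2d_to_3d(features_2d, depth):
    """Copy a (C, H, W) feature map along a new depth axis, giving (C, D, H, W).

    Assumption: the 2D-to-3D transfer is a plain copy along depth.
    """
    c, h, w = features_2d.shape
    return np.broadcast_to(features_2d[:, None, :, :], (c, depth, h, w)).copy()


def skip_connect(decoder_volume, encoder_features_2d):
    """Concatenate inflated 2D encoder features with a 3D decoder volume.

    decoder_volume: (C_dec, D, H, W); encoder_features_2d: (C_enc, H, W).
    Returns an array of shape (C_dec + C_enc, D, H, W).
    """
    depth = decoder_volume.shape[1]
    inflated = inflate_2d_to_3d(encoder_features_2d, depth)
    return np.concatenate([decoder_volume, inflated], axis=0)


rng = np.random.default_rng(0)
enc = rng.standard_normal((8, 16, 16))       # hypothetical 2D encoder features
dec = rng.standard_normal((4, 16, 16, 16))   # hypothetical 3D decoder features
fused = skip_connect(dec, enc)
print(fused.shape)  # (12, 16, 16, 16)
```

In this sketch every depth slice of the inflated features is identical, so the image silhouette is available at each depth of the volume, and a subsequent 3D layer (not shown) would decide which voxels along each ray are occupied.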

Fabio Remondino | V. Knyaz | V. Kniaz
