WIRE STRUCTURE IMAGE-BASED 3D RECONSTRUCTION AIDED BY DEEP LEARNING

Abstract. Objects and structures realized by connecting and bending wires are common in modern architecture, furniture design, metal sculpting, etc. The 3D reconstruction of such objects with traditional range- or image-based methods is very difficult because of their distinctive characteristics: repeated structures, slim elements, holes, lack of features, self-occlusions, etc. Complete 3D models of such complex structures are normally reconstructed with substantial manual intervention, as automated processes fail to provide detailed and accurate 3D reconstruction results. This paper presents the image-based 3D reconstruction of the Shukhov hyperboloid tower in Moscow, a wire structure built in 1922 and composed of a series of hyperboloid sections stacked one on top of another to approximate an overall conical shape. A deep learning approach for image segmentation was developed to robustly detect wire structures in images and to provide the basis for accurate solutions to the correspondence problem. The developed WireNet convolutional neural network (CNN) model was used to aid the multi-view stereo (MVS) process and to improve the robustness and accuracy of the image-based 3D reconstruction, which was not feasible without automatic masking of the images.
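The masking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the segmentation network outputs a per-pixel wire probability map of the same height and width as the image; the abstract does not describe WireNet's actual output format, so the function name `apply_wire_mask`, the `threshold` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def apply_wire_mask(image: np.ndarray, wire_prob: np.ndarray,
                    threshold: float = 0.5) -> np.ndarray:
    """Keep only pixels labeled as wire; set all other pixels to zero.

    image     : H x W x C array (e.g. an RGB photograph)
    wire_prob : H x W array of per-pixel wire probabilities (assumed format)
    threshold : probability above which a pixel counts as wire (assumption)
    """
    if image.shape[:2] != wire_prob.shape:
        raise ValueError("image and probability map must share height and width")
    mask = wire_prob >= threshold          # boolean H x W mask
    masked = np.zeros_like(image)          # background stays black
    masked[mask] = image[mask]             # copy wire pixels only
    return masked


# Toy example: a uniform 4 x 4 image with one vertical "wire" in column 1.
image = np.full((4, 4, 3), 200, dtype=np.uint8)
wire_prob = np.zeros((4, 4), dtype=np.float32)
wire_prob[:, 1] = 0.9
masked = apply_wire_mask(image, wire_prob)
```

In this sketch, background pixels are zeroed so that a later dense matching stage only sees wire pixels. A real pipeline might instead pass the binary mask to the MVS software as a separate file.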
