Multi-Sensor Large-Scale Dataset for Multi-View 3D Reconstruction

We present a new multi-sensor dataset for multi-view 3D surface reconstruction. It includes registered RGB and depth data from sensors of different resolutions and modalities: smartphones, Intel RealSense, Microsoft Kinect, industrial cameras, and a structured-light scanner. The scenes are selected to emphasize a diverse set of material properties that are challenging for existing algorithms. We provide around 1.4 million images of 107 different scenes, acquired from 100 viewing directions under 14 lighting conditions. We expect the dataset to be useful for evaluating and training 3D reconstruction algorithms and for related tasks. The dataset is available at skoltech3d.appliedai.tech.
