Paper Information - Learning Non-Volumetric Depth Fusion Using Successive Reprojections

Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent reconstruction, most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has significantly improved the accuracy and speed of depth estimation, learned MVS techniques remain limited to the plane-sweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach is able to improve both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.
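
To make the reprojection step mentioned in the abstract concrete, the following is a minimal geometric sketch of how a neighbouring view's depth map could be reprojected into a reference view. It is not the authors' learned refinement network; it only illustrates the multi-view geometry under stated assumptions: a pinhole camera model, known intrinsics, and a known relative pose. The function name `reproject_depth` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def reproject_depth(depth_src, K_src, K_ref, R, t, out_shape):
    """Reproject a neighbour-view depth map into the reference view.

    Assumptions (not taken from the paper): pinhole cameras, and
    x_ref = R @ x_src + t maps points from the neighbour (source) camera
    frame to the reference camera frame. Pixels that receive no point
    are set to 0 in the returned depth map.
    """
    h, w = depth_src.shape
    v, u = np.mgrid[0:h, 0:w]
    valid = depth_src > 0
    # Homogeneous pixel coordinates of the valid source pixels, shape (3, N).
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())]).astype(np.float64)
    # Back-project to 3D points in the source camera frame.
    pts_src = (np.linalg.inv(K_src) @ pix) * depth_src[valid]
    # Move the points into the reference camera frame.
    pts_ref = R @ pts_src + t.reshape(3, 1)
    # Discard points that end up behind the reference camera.
    pts_ref = pts_ref[:, pts_ref[2] > 1e-9]
    z = pts_ref[2]
    # Project into the reference image and round to the nearest pixel.
    proj = K_ref @ pts_ref
    u_ref = np.rint(proj[0] / proj[2]).astype(int)
    v_ref = np.rint(proj[1] / proj[2]).astype(int)
    H, W = out_shape
    inside = (u_ref >= 0) & (u_ref < W) & (v_ref >= 0) & (v_ref < H)
    u_ref, v_ref, z = u_ref[inside], v_ref[inside], z[inside]
    # Z-buffer: keep the closest surface when several points hit one pixel.
    out = np.full(out_shape, np.inf)
    np.minimum.at(out, (v_ref, u_ref), z)
    out[np.isinf(out)] = 0.0
    return out
```

With an identity pose and identical intrinsics this returns the input depth map unchanged. In a fusion setting, a reprojected map of this kind would serve as one of the per-neighbour inputs that a refinement step could compare against the reference view's own depth estimate.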

Simon Donné | Andreas Geiger
