DELS-MVS: Deep Epipolar Line Search for Multi-View Stereo

We propose a novel approach for deep learning-based Multi-View Stereo (MVS). For each pixel in the reference image, our method leverages a deep architecture to search for the corresponding point in the source image directly along the corresponding epipolar line. We call our method DELS-MVS: Deep Epipolar Line Search Multi-View Stereo. Previous works in deep MVS select a range of interest within the depth space, discretize it, and sample the epipolar line according to the resulting depth values, which can result in an uneven scanning of the epipolar line and hence of the image space. Instead, our method works directly on the epipolar line: this guarantees an even scanning of the image space and avoids both the need to select a depth range of interest, which is often not known a priori and can vary dramatically from scene to scene, and the need for a suitable discretization of the depth space. Moreover, our search is iterative, which avoids building a cost volume that is costly both to store and to process. Finally, our method performs a robust geometry-aware fusion of the estimated depth maps, leveraging a confidence predicted alongside each depth. We test DELS-MVS on the ETH3D, Tanks and Temples, and DTU benchmarks and achieve competitive results with respect to state-of-the-art approaches.
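
The claim that depth discretization scans the epipolar line unevenly can be illustrated with a small numerical sketch. It is not the DELS-MVS architecture: it only projects one reference pixel into a source view at uniformly spaced depths and compares the resulting step sizes with an even sampling of the same epipolar segment. The intrinsics `K`, the pose `R`, `t`, the depth range, and the helper `project_depths` are all hypothetical values and names chosen for illustration.

```python
import numpy as np

def project_depths(K, R, t, pixel, depths):
    """Back-project a reference pixel at each depth and project it into the source image.

    Assumes the reference camera sits at the origin with identity rotation,
    both cameras share the intrinsics K, and (R, t) maps reference-frame
    points into the source camera frame. All of this is an illustrative setup.
    """
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])  # viewing ray with z = 1
    points = depths[:, None] * ray[None, :]   # (N, 3) points in the reference frame
    cam = points @ R.T + t                    # same points in the source camera frame
    proj = cam @ K.T                          # homogeneous image coordinates
    return proj[:, :2] / proj[:, 2:3]         # (N, 2) pixel coordinates

# Hypothetical pinhole cameras: source camera shifted sideways by 0.2 units.
K = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.2, 0.0, 0.0])

# Uniform discretization of an assumed depth range of interest.
depths = np.linspace(1.0, 10.0, 8)
pts = project_depths(K, R, t, (320.0, 240.0), depths)

# Pixel distance between consecutive samples on the epipolar line.
steps_depth = np.linalg.norm(np.diff(pts, axis=0), axis=1)

# Even sampling of the same epipolar segment, directly in image space.
even = np.linspace(pts[0], pts[-1], len(depths))
steps_even = np.linalg.norm(np.diff(even, axis=0), axis=1)

print("steps from uniform depths:", np.round(steps_depth, 2))
print("steps from even line sampling:", np.round(steps_even, 2))
```

In this setup the depth-based samples are far apart near the camera and bunch together at large depths, while sampling the line itself gives a constant step. Changing the assumed depth range changes the depth-based spacing, which is the sensitivity the abstract describes.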
