Paper Information - PHI-MVS: Plane Hypothesis Inference Multi-view Stereo for Large-Scale Scene Reconstruction


PatchMatch-based multi-view stereo (MVS) algorithms have achieved great success in large-scale scene reconstruction. However, they often fail to reconstruct texture-less planes, because similarity measures can become unreliable in such regions. We propose a new plane hypothesis inference strategy to address this issue. The procedure consists of two steps: first, multiple plane hypotheses are generated from the filtered initial depth maps in regions that were not successfully recovered; second, depth hypotheses are selected using a Markov Random Field (MRF). The strategy significantly improves the completeness of the reconstruction at an acceptable increase in computing time. In addition, a new acceleration scheme similar to dilated convolution speeds up depth map estimation with only a slight influence on the reconstruction. We integrate these ideas into a new MVS pipeline, Plane Hypothesis Inference Multi-view Stereo (PHI-MVS). PHI-MVS is validated on the ETH3D public benchmarks, where it demonstrates performance competitive with the state of the art.
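To make the notion of a plane hypothesis concrete, the sketch below shows how a single plane turns into a per-pixel depth hypothesis. It is a minimal illustration under stated assumptions, not the authors' implementation: the plane is assumed to be written as `normal . X + offset = 0` in the reference camera's coordinate frame, a pinhole model with intrinsics `K` is assumed, and the function name `plane_depth` and its parameters are hypothetical. How PHI-MVS actually parameterizes planes and selects among them with the MRF is not specified in the abstract.

```python
import numpy as np

def plane_depth(K, normal, offset, u, v):
    """Depth at pixel (u, v) induced by the plane normal . X + offset = 0.

    Assumptions (not from the paper): the plane is given in camera
    coordinates and K is a 3x3 pinhole intrinsic matrix.
    """
    # Back-project the pixel: ray = K^-1 [u, v, 1]^T, whose z-component is 1,
    # so a 3D point on the ray is X = depth * ray.
    ray = np.linalg.solve(K, np.array([u, v, 1.0]))
    denom = float(normal @ ray)
    if abs(denom) < 1e-12:
        return None  # viewing ray is parallel to the plane
    # Substitute X = depth * ray into the plane equation and solve for depth.
    depth = -offset / denom
    return depth if depth > 0 else None  # discard planes behind the camera
```

Each candidate plane yields one depth value per pixel this way, so a set of planes gives a discrete label set per pixel, which is the kind of input a labeling method such as an MRF can choose from.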
