A TV Prior for High-Quality Local Multi-view Stereo Reconstruction

Local fusion of disparity maps allows fast parallel 3D modeling of large scenes that do not fit into main memory. While existing methods assume a constant disparity uncertainty, disparity errors typically vary spatially from tenths of a pixel to several pixels. In this paper we propose a method that employs a set of Gaussians, one per disparity class, instead of a single error model with only one variance. The set of Gaussians is learned from the difference between generated disparity maps and ground-truth disparities. Pixels are assigned to disparity classes based on a Total Variation (TV) feature that measures the local oscillation behavior of the 2D disparity map. This feature captures uncertainty caused, for instance, by lack of texture or by the fronto-parallel bias of the stereo method. Experimental results on several datasets in varying configurations demonstrate that our method yields improved performance both qualitatively and quantitatively.
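The idea of class-dependent error models can be illustrated with a minimal sketch. The abstract does not specify how the TV feature is computed, how class boundaries are chosen, or how the Gaussians are fitted, so everything below is an assumption made for illustration: the TV feature is taken as the sum of absolute forward differences in a small window, classes are defined by hypothetical thresholds, and each class gets a zero-mean Gaussian whose standard deviation is the RMS disparity error against ground truth. The function names `tv_feature`, `tv_class`, and `learn_class_sigmas` are likewise hypothetical.

```python
import math


def tv_feature(disp, radius=1):
    """Assumed TV feature: sum of |dx| + |dy| forward differences of the
    disparity map inside a (2*radius+1)^2 window around each pixel."""
    h, w = len(disp), len(disp[0])
    # Per-pixel L1 gradient magnitude; differences past the border count as 0.
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(disp[y][x + 1] - disp[y][x]) if x + 1 < w else 0.0
            dy = abs(disp[y + 1][x] - disp[y][x]) if y + 1 < h else 0.0
            grad[y][x] = dx + dy
    # Accumulate the gradient magnitudes over the local window.
    tv = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += grad[yy][xx]
            tv[y][x] = total
    return tv


def tv_class(tv_value, thresholds):
    """Return the index of the first threshold that tv_value does not exceed,
    or len(thresholds) if it exceeds all of them (thresholds are hypothetical)."""
    for k, t in enumerate(thresholds):
        if tv_value <= t:
            return k
    return len(thresholds)


def learn_class_sigmas(disp, gt, thresholds, radius=1):
    """Fit one zero-mean Gaussian per TV class: sigma is the RMS difference
    between estimated and ground-truth disparity. Empty classes yield None."""
    tv = tv_feature(disp, radius)
    n_classes = len(thresholds) + 1
    sq_sums = [0.0] * n_classes
    counts = [0] * n_classes
    for y in range(len(disp)):
        for x in range(len(disp[0])):
            k = tv_class(tv[y][x], thresholds)
            err = disp[y][x] - gt[y][x]
            sq_sums[k] += err * err
            counts[k] += 1
    return [math.sqrt(s / c) if c else None for s, c in zip(sq_sums, counts)]
```

At fusion time, a pixel's class index would select the learned standard deviation in place of a single global one. How that uncertainty enters the fusion step is not described in the abstract and is left out of this sketch.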
