Manhattan-world stereo

Multi-view stereo (MVS) algorithms now produce reconstructions that rival the accuracy of laser range scanners. However, stereo algorithms require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach that overcomes this limitation for Manhattan-world scenes, i.e., scenes that consist of piecewise planar surfaces aligned with a small number of dominant directions. Given a set of calibrated photographs, we first reconstruct textured regions using an existing MVS algorithm, then extract dominant plane directions, generate plane hypotheses, and recover per-view depth maps using Markov random fields. We tested our algorithm on several datasets ranging from office interiors to outdoor buildings, and demonstrate results that outperform the current state of the art for such texture-poor scenes.
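The plane-hypothesis step can be illustrated with a small sketch. It assumes that a dominant direction is already known as a unit normal, that each reconstructed 3D point contributes a signed offset d = n · p along that normal, and that candidate planes are taken as the modes of those offsets. The flat-kernel 1D mean shift, the function names `plane_offsets` and `plane_hypotheses`, and the `bandwidth` parameter are illustrative assumptions, not details given in the abstract.

```python
def plane_offsets(points, normal):
    """Signed offset d = n . p of each 3D point along a unit normal."""
    return [sum(n * c for n, c in zip(normal, p)) for p in points]


def plane_hypotheses(offsets, bandwidth, iters=50, tol=1e-6):
    """Find modes of 1D offsets with a flat-kernel mean shift.

    Returns a sorted list of (offset, support) pairs, where support is
    the number of input points that converged to that mode.
    """
    modes = []
    for start in offsets:
        x = start
        for _ in range(iters):
            # Average all samples within one bandwidth of the current estimate.
            window = [u for u in offsets if abs(u - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        # Merge with an existing mode if it lies within one bandwidth.
        for i, (m, support) in enumerate(modes):
            if abs(m - x) <= bandwidth:
                modes[i] = (m, support + 1)
                break
        else:
            modes.append((x, 1))
    return sorted(modes)


# Hypothetical input: points scattered around the planes x = 0 and x = 2.
points = [
    (0.00, 0.3, 1.0), (0.01, 0.9, 0.2), (-0.01, 0.5, 0.7),
    (2.00, 0.1, 0.4), (2.02, 0.8, 0.9), (1.98, 0.6, 0.1),
]
offsets = plane_offsets(points, (1.0, 0.0, 0.0))
hypotheses = plane_hypotheses(offsets, bandwidth=0.1)
```

Each returned pair is one candidate plane along the chosen direction. In a full pipeline, candidates from all dominant directions would serve as the label set for the per-view Markov random field.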
