A Heightmap Model for Efficient 3D Reconstruction from Street-Level Video

This paper introduces a fast approach for automatic, dense, large-scale 3D urban reconstruction from video. The presented system uses a novel multi-view depthmap fusion algorithm in which the surface is represented by a heightmap. Whereas most other systems attempt to produce true 3D surfaces, our simplified model is a 2.5D representation. Although this model seems a more natural fit for aerial and satellite data, we have found it to be a powerful representation for ground-level reconstructions as well. It has the advantage of producing purely vertical facades, and it yields a continuous surface without holes. Compared to more general 3D reconstruction methods, our algorithm is more efficient, uses less memory, and produces more compact models, at the expense of losing some detail. Our GPU implementation can compute a 200 × 200 heightmap from 64 depthmaps in just 92 milliseconds. We demonstrate our system on a variety of challenging ground-level datasets, including large buildings, residential houses, and storefront facades, obtaining clean, complete, compact, and visually pleasing 3D models.
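To make the 2.5D idea concrete, the following is a minimal sketch of how a single height per grid cell could be chosen from per-voxel evidence. It is not the fusion algorithm of the paper, whose details the abstract does not give. The vote convention (positive for "occupied", negative for "empty"), the cost function, and the names `column_height` and `fuse_heightmap` are all assumptions made for illustration.

```python
def column_height(votes):
    """Pick how many voxels, counted from the bottom, to label as full.

    Assumed convention (hypothetical, not from the paper):
    votes[k] > 0 means depthmaps suggest voxel k is occupied,
    votes[k] < 0 means viewing rays passed through it (empty).
    """
    # cost(h) = -(sum of votes below h) + (sum of votes at or above h):
    # full voxels are penalized for "empty" evidence and vice versa.
    total = sum(votes)
    below = 0.0
    best_h, best_cost = 0, total  # h = 0 labels the whole column empty
    for k, v in enumerate(votes):
        below += v
        cost = -below + (total - below)
        if cost < best_cost:
            best_h, best_cost = k + 1, cost
    return best_h


def fuse_heightmap(vote_grid, voxel_size):
    """Turn a 2D grid of vote columns into a 2D grid of heights."""
    return [[column_height(col) * voxel_size for col in row]
            for row in vote_grid]


# Tiny usage example: one row with two columns of five voxels each.
grid = [[[2, 3, 1, -4, -2], [-1, -1, 0, 0, 0]]]
heights = fuse_heightmap(grid, voxel_size=0.5)
```

Because every cell receives exactly one height, the resulting surface has no holes, and height jumps between neighboring cells become vertical walls. This matches the properties the abstract attributes to the heightmap model, and each column is processed independently, which is the kind of structure that maps well onto a GPU.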
