Image-based street-side city modeling

We propose an automatic approach to generate street-side 3D photo-realistic models from images captured along the streets at ground level. We first develop a multi-view semantic segmentation method that recognizes and segments each image at pixel level into semantically meaningful areas, each labeled with a specific object class, such as building, sky, ground, vegetation and car. A partition scheme is then introduced to separate buildings into independent blocks using the major line structures of the scene. Finally, for each block, we propose an inverse patch-based orthographic composition and structure analysis method for façade modeling that efficiently regularizes the noisy and missing reconstructed 3D data. Our system has the distinct advantage of producing visually compelling results by imposing strong priors of building regularity. We demonstrate the fully automatic system on a typical city example to validate our methodology.

[1]  Andrew Zisserman,et al.  Multiple View Geometry in Computer Vision (2nd ed) , 2003 .

[2]  Luc Van Gool,et al.  3D Urban Scene Modeling Integrating Recognition and Reconstruction , 2008, International Journal of Computer Vision.

[3]  Jitendra Malik,et al.  Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach , 1996, SIGGRAPH.

[4]  Frank Dellaert,et al.  Line-Based Structure from Motion for Urban Environments , 2006, Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06).

[5]  Alexei A. Efros,et al.  Automatic photo pop-up , 2005, ACM Trans. Graph..

[6]  Donald Geman,et al.  Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[8]  Philip H. S. Torr,et al.  VideoTrace: rapid interactive scene modelling from video , 2007, ACM Trans. Graph..

[9]  Jan-Michael Frahm,et al.  Detailed Real-Time Urban 3D Reconstruction from Video , 2007, International Journal of Computer Vision.

[10]  KeeChang Lee,et al.  Fast Automatic Single-View 3-d Reconstruction of Urban Scenes , 2008, ECCV.

[11]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Patrick Pérez,et al.  Poisson image editing , 2003, ACM Trans. Graph..

[13]  Christian Früh,et al.  Constructing 3D city models by merging ground-based and airborne views , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[14]  Mark de Berg,et al.  Computational geometry: algorithms and applications , 1997 .

[15]  Antonio Torralba,et al.  Sharing Visual Features for Multiclass and Multiview Object Detection , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[17]  Lukas Zebedin,et al.  Towards 3D map generation from digital aerial images , 2006 .

[18]  Antonio Criminisi,et al.  Object categorization by learned universal visual dictionary , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[19]  Andrew Zisserman,et al.  Model Selection for Automated Architectural Reconstruction from Multiple Views , 2002 .

[20]  Antonio Criminisi,et al.  TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context , 2007, International Journal of Computer Vision.

[21]  Richard Szeliski,et al.  Interactive 3D architectural modeling from unordered photo collections , 2008, SIGGRAPH Asia '08.

[22]  Long Quan,et al.  A quasi-dense approach to surface reconstruction from uncalibrated images , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Marc Levoy,et al.  A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[24]  Roberto Cipolla,et al.  Modelling and Interpretation of Architecture from Several Images , 2004, International Journal of Computer Vision.

[25]  Frédo Durand,et al.  A gentle introduction to bilateral filtering and its applications , 2007, SIGGRAPH Courses.

[26]  Judea Pearl,et al.  Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach , 1982, AAAI.

[27]  Jianxiong Xiao,et al.  Image-based façade modeling , 2008, ACM Trans. Graph..

[28]  Christian Früh,et al.  Automated reconstruction of building facades for virtual walk-thrus , 2003, SIGGRAPH '03.

[29]  Mark de Berg,et al.  Computational geometry: algorithms and applications, 3rd Edition , 1997 .

[30]  Luc Van Gool,et al.  Image-based procedural modeling of facades , 2007, ACM Trans. Graph..

[31]  John Hart,et al.  ACM Transactions on Graphics , 2004, SIGGRAPH 2004.

[32]  Alan L. Yuille,et al.  Manhattan World: compass direction from a single image by Bayesian inference , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[33]  Antonio Torralba,et al.  Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.

[34]  Ioannis Stamos,et al.  Geometry and Texture Recovery of Scenes of Large Scale , 2002, Comput. Vis. Image Underst..