Real-time exploration and photorealistic reconstruction of large natural environments

This paper presents a hybrid (geometry- and image-based) framework for providing photorealistic walkthroughs of large, complex outdoor scenes, based only on a small set of real images of the scene. To this end, a novel data representation of a 3D scene is proposed, called morphable 3D panoramas. Motion is assumed to take place along a predefined path through the 3D environment, and the input to the system is a sparse set of stereoscopic views at certain positions (key positions) along that path (one view per position). An approximate local 3D model is constructed from each view, capturing the photometric and geometric properties of the scene only locally. Then, during rendering, a continuous morphing (both photometric and geometric) takes place between successive local 3D models, using what we call a ‘morphable 3D model’. For the estimation of the photometric morphing, a robust algorithm capable of extracting a dense field of 2D correspondences between wide-baseline images is used, whereas, for the geometric morphing, a novel method of computing 3D correspondences between local models is proposed. In this way, a physically valid morphing is always produced, which thus remains transparent to the user. Moreover, a highly optimized rendering path is used during morphing. Thanks to the use of appropriate pixel and vertex shaders, this rendering path runs entirely on 3D graphics hardware and thus allows for high frame rates.
Our system can be extended to handle multiple stereoscopic views (and therefore multiple local models) per key position of the path, with the views at each key position related by a camera rotation. In this case, one local 3D panorama is constructed per key position, comprising all local 3D models at that position, and so a ‘morphable 3D panorama’ is used during the rendering process. To keep each 3D panorama geometrically consistent, a technique based on solving a partial differential equation is adopted.
The effectiveness of our framework is demonstrated by using it for the 3D visual reconstruction of the Samaria Gorge in Crete.
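
To make the idea of morphing between successive local 3D models concrete, the following is a minimal sketch and not the implementation described in this paper. It assumes that 3D correspondences have already been computed, so that vertex i of one local model matches vertex i of the next, and it assumes a plain linear blend of positions and colors driven by a parameter t in [0, 1]; the abstract does not specify the actual blending scheme. The names `LocalModel`, `lerp` and `morph` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class LocalModel:
    """Hypothetical local 3D model: per-vertex positions and RGB colors."""
    vertices: List[Vec3]
    colors: List[Vec3]


def lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    """Linear interpolation between two 3-vectors."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def morph(src: LocalModel, dst: LocalModel, t: float) -> LocalModel:
    """Blend two local models at t in [0, 1].

    Assumption: vertex i of `src` corresponds to vertex i of `dst`,
    i.e. the 3D correspondences were computed beforehand.
    Assumption: the blend is linear in both geometry and color.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(src.vertices) != len(dst.vertices):
        raise ValueError("models must have the same number of vertices")
    if len(src.colors) != len(dst.colors):
        raise ValueError("models must have the same number of colors")
    return LocalModel(
        # Geometric morphing: move each vertex toward its correspondence.
        vertices=[lerp(p, q, t) for p, q in zip(src.vertices, dst.vertices)],
        # Photometric morphing: cross-dissolve the corresponding colors.
        colors=[lerp(c, d, t) for c, d in zip(src.colors, dst.colors)],
    )
```

In a walkthrough, t would follow the viewer's position between two consecutive key positions, so that t = 0 reproduces the first local model and t = 1 the second. In a hardware implementation, the same per-vertex and per-pixel blends would be natural candidates for vertex and pixel shaders, respectively.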
