Fast robust reconstruction of large-scale environments

This paper tackles the active research problem of fast automatic modeling of large-scale environments from videos and unorganized still image collections. We describe a scalable 3D reconstruction framework that leverages recent research in robust estimation, image-based recognition, and stereo depth estimation. High computational speed is achieved through parallelization and execution on commodity graphics hardware. For video, we have implemented a reconstruction system that works in real time; for still photo collections, we have a system that is capable of processing thousands of images in less than a day on a single commodity computer. Modeling results from both systems are shown on a variety of large-scale real-world datasets.

[1]  Armin Gruen,et al.  CC-MODELER : A TOPOLOGY GENERATOR FOR 3-D CITY MODELS , 1998 .

[2]  Tamara L. Berg,et al.  Automatic Ranking of Iconic Images , 2007 .

[3]  Paul A. Beardsley,et al.  Sequential Updating of Projective and Affine Structure from Motion , 1997, International Journal of Computer Vision.

[4]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[5]  Richard Szeliski,et al.  Piecewise planar stereo for image-based rendering , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[6]  Ruigang Yang,et al.  Gain Adaptive Real-Time Stereo Streaming , 2007 .

[7]  Jan-Michael Frahm,et al.  Real-Time Visibility-Based Fusion of Depth Maps , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[8]  Allen R. Hanson,et al.  Generalized parallel-perspective stereo mosaics from airborne video , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Richard Szeliski,et al.  Manhattan-world stereo , 2009, CVPR.

[10]  Frank Dellaert,et al.  Out-of-Core Bundle Adjustment for Large-Scale 3D Reconstruction , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[11]  E. Mikhail,et al.  Manual of Photogrammetry, 5th Edition , 2006 .

[12]  Luc Van Gool,et al.  Fast Compact City Modeling for Navigation Pre-Visualization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[13]  Jan-Michael Frahm,et al.  Detailed Real-Time Urban 3D Reconstruction from Video , 2007, International Journal of Computer Vision.

[14]  Frank Dellaert,et al.  Line-Based Structure from Motion for Urban Environments , 2006, Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06).

[15]  Alexander C. Berg,et al.  Finding iconic images , 2009, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[16]  Bernhard P. Wrobel,et al.  Multiple View Geometry in Computer Vision , 2001 .

[17]  Michael Goesele,et al.  Multi-View Stereo for Community Photo Collections , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[18]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[19]  Jan-Michael Frahm,et al.  Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs , 2008, International Journal of Computer Vision.

[20]  Jan-Michael Frahm,et al.  A Comparative Analysis of RANSAC Techniques Leading to Adaptive Real-Time Random Sample Consensus , 2008, ECCV.

[21]  Antonio Criminisi,et al.  Harvesting Image Databases from the Web , 2007, ICCV.

[22]  Andrew Zisserman,et al.  New Techniques for Automated Architectural Reconstruction from Photographs , 2002, ECCV.

[23]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[24]  Robert T. Collins,et al.  A space-sweep approach to true multi-image matching , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Jan-Michael Frahm,et al.  RANSAC for (Quasi-)Degenerate data (QDEGSAC) , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[26]  Horst Bischof,et al.  From structure-from-motion point clouds to fast location recognition , 2009, CVPR.

[27]  Ruigang Yang,et al.  Multi-resolution real-time stereo on commodity graphics hardware , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[28]  Andrew Zisserman,et al.  Multiple View Geometry in Computer Vision (2nd ed) , 2003 .

[29]  Reinhard Koch,et al.  Robust Calibration and 3D Geometric Modeling From Large Collections of Uncalibrated Images , 1999, DAGM-Symposium.

[30]  Richard Szeliski,et al.  A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[31]  D. Scharstein,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[32]  James R. Bergen,et al.  Visual odometry for ground vehicle applications , 2006, J. Field Robotics.

[33]  Richard Szeliski,et al.  Handling occlusions in dense multi-view stereo , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[34]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[35]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Richard Szeliski,et al.  Building Rome in a day , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[37]  Richard Szeliski,et al.  Modeling the World from Internet Photo Collections , 2008, International Journal of Computer Vision.

[38]  Cordelia Schmid,et al.  Evaluation of GIST descriptors for web-scale image search , 2009, CIVR '09.

[39]  Jan-Michael Frahm,et al.  Fast gain-adaptive KLT tracking on the GPU , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.