Detailed Real-Time Urban 3D Reconstruction from Video

Abstract: The paper presents a system for automatic, geo-registered, real-time 3D reconstruction of urban scenes from video. The system collects video streams along with GPS and inertial measurements in order to place the reconstructed models in geo-registered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps, and it employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability requirements of operating outside the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for them in stereo estimation without impacting real-time performance. The accuracy required by many applications is achieved with a two-step stereo reconstruction process that exploits the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
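
The abstract does not spell out how the global gain change is estimated, so the following is only an illustrative sketch under a stated assumption: the gain between two consecutive frames is modeled as a single multiplicative factor g, fitted by least squares over the intensities of corresponding tracked pixels. The function names `estimate_global_gain` and `compensate_gain` are hypothetical and are not taken from the paper.

```python
def estimate_global_gain(prev_intensities, curr_intensities):
    """Fit one multiplicative gain g such that curr ~= g * prev.

    Assumption (not from the paper): the gain is the closed-form
    least-squares solution g = sum(p * c) / sum(p * p) over the
    intensities of corresponding tracked pixels.
    """
    if len(prev_intensities) != len(curr_intensities):
        raise ValueError("intensity lists must have the same length")
    numerator = sum(p * c for p, c in zip(prev_intensities, curr_intensities))
    denominator = sum(p * p for p in prev_intensities)
    if denominator == 0:
        raise ValueError("previous-frame intensities are all zero")
    return numerator / denominator


def compensate_gain(intensities, gain):
    """Scale intensities by the estimated gain so that two frames
    become photometrically comparable before stereo matching."""
    return [gain * value for value in intensities]


# Hypothetical sample data: the second frame is uniformly 20% brighter.
prev_samples = [10.0, 20.0, 40.0]
curr_samples = [12.0, 24.0, 48.0]
gain = estimate_global_gain(prev_samples, curr_samples)
matched = compensate_gain(prev_samples, gain)
```

A single scalar per frame pair keeps the correction cheap: the stereo stage can apply the factor to one image before computing matching costs, which fits the abstract's claim that compensation does not affect real-time performance.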
