Development of a SGM-based multi-view reconstruction framework for aerial imagery

Advances in digital airborne camera technology allow surfaces to be observed at sampling distances of a few centimeters. In combination with novel matching approaches, which estimate depth information for virtually every pixel, surface reconstructions of impressive density and precision can be generated. Image-based surface generation has therefore become a serious alternative to LiDAR-based data collection for many applications. Surface models serve as the primary basis for geographic products such as maps, true orthophotos, and visualizations within virtual globes. The goal of the presented thesis is the development of a framework for the fully automatic generation of 3D surface models from aerial images, covering both standard nadir and oblique views. This comprises several challenges. On the one hand, the dimensions of aerial imagery are considerable, and the extent of the areas to be reconstructed can encompass whole countries. Besides scalability of the methods, this also requires acceptable processing times and efficient use of the available hardware resources. Moreover, besides high precision requirements, a high degree of automation has to be guaranteed to limit manual interaction as much as possible. Because of its advantages in scalability, a stereo method is utilized in the presented thesis. The approach for dense stereo is based on an adapted version of the semi-global matching (SGM) algorithm. Following a hierarchical approach, corresponding image regions and meaningful disparity search ranges are identified. It is shown that, depending on the undulation of the scene, time and memory demands can be reduced significantly, by up to 90% in some of the conducted tests. This enables the processing of aerial datasets on standard desktop machines in reasonable time, even for scenes with a large depth range.
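The idea of restricting disparity search ranges hierarchically can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the 3x3 neighbourhood, the `margin` parameter and the factor-of-two pyramid are assumptions chosen for illustration. Given a disparity map from a coarser pyramid level, each pixel of the next finer level receives a search interval derived from the disparities around its parent pixel. Pixels without any valid coarse match fall back to the full range.

```python
import math
from typing import List, Optional, Tuple

Disparity = Optional[float]  # None marks a pixel without a valid match
Range = Tuple[int, int]      # inclusive (min, max) disparity candidates


def propagate_search_ranges(
    coarse: List[List[Disparity]],
    margin: int = 2,                    # hypothetical safety margin in pixels
    full_range: Range = (0, 64),        # hypothetical full search range
) -> List[List[Range]]:
    """Derive per-pixel search ranges for a level with twice the resolution."""
    rows, cols = len(coarse), len(coarse[0])
    fine: List[List[Range]] = []
    for y in range(2 * rows):
        row: List[Range] = []
        for x in range(2 * cols):
            cy, cx = y // 2, x // 2  # parent pixel in the coarse level
            # Valid disparities in the 3x3 neighbourhood of the parent pixel.
            neighbours = [
                coarse[ny][nx]
                for ny in range(max(0, cy - 1), min(rows, cy + 2))
                for nx in range(max(0, cx - 1), min(cols, cx + 2))
                if coarse[ny][nx] is not None
            ]
            if not neighbours:
                row.append(full_range)  # nothing known: search everything
                continue
            # Disparities double when the image resolution doubles.
            lo = math.floor(2 * min(neighbours)) - margin
            hi = math.ceil(2 * max(neighbours)) + margin
            row.append((max(full_range[0], lo), min(full_range[1], hi)))
        fine.append(row)
    return fine


def search_space_fraction(ranges: List[List[Range]], full_range: Range) -> float:
    """Share of disparity candidates evaluated relative to a full search."""
    full = full_range[1] - full_range[0] + 1
    evaluated = sum(hi - lo + 1 for row in ranges for lo, hi in row)
    pixels = sum(len(row) for row in ranges)
    return evaluated / (full * pixels)
```

In this sketch a smooth surface yields narrow intervals, so only a small share of the cost volume is evaluated and stored. Strongly undulating scenes produce wider intervals and smaller savings, which matches the scene dependence described above.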
Stereo approaches generate disparity or depth maps in which redundant depth information is available. To exploit this redundancy, a method for the refinement of stereo correspondences is proposed. Redundant observations across stereo models are identified and checked for geometric consistency, and their reprojection error is minimized. In this way outliers are removed and the precision of the depth estimates is improved. In order to generate consistent surfaces, two algorithms for depth map fusion were developed. The first fusion strategy aims at the generation of 2.5D height models, also known as digital surface models (DSMs). The proposed method improves on existing methods regarding quality in areas of depth discontinuities, for example at roof edges. Using benchmarks designed for the evaluation of image-based DSM generation, we show that the developed approaches compare favorably to state-of-the-art algorithms and that height precisions of a few ground sampling distances (GSDs) can be achieved. Furthermore, methods for the derivation of meshes from DSM data are discussed. The fusion of depth maps for 3D scenes, as frequently required, for example, when evaluating high-resolution oblique aerial images of complex urban environments, demands a different approach, since such scenes can in general not be represented as height fields. Moreover, depths across depth maps possess varying precision and sampling rates due to variations in image scale, errors in orientation, and other effects. Within this thesis a median-based fusion methodology is proposed. Using a geometry-adaptive triangulation of the depth maps, depth-wise normals are extracted and, along with the point coordinates, filtered and fused using tree structures. The output of this method is a set of oriented points, which can then be used to generate meshes. The precision and density of the method are evaluated using established multi-view benchmarks.
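To make the notion of fusing redundant depth observations into a 2.5D height model concrete, the following minimal sketch bins 3D points from several depth maps into a regular grid and keeps a robust height per cell. The per-cell median, the function name `fuse_to_dsm` and the parameters `gsd` and `min_obs` are assumptions made for illustration. The abstract does not specify how the DSM fusion in the thesis treats each cell, and this sketch does not reproduce its handling of depth discontinuities.

```python
import math
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in object coordinates
Cell = Tuple[int, int]              # (column, row) index of a DSM cell


def fuse_to_dsm(
    points: Iterable[Point],
    gsd: float,        # assumed cell size of the height grid
    min_obs: int = 2,  # hypothetical redundancy threshold per cell
) -> Dict[Cell, float]:
    """Fuse redundant 3D points into one height per grid cell."""
    cells: Dict[Cell, List[float]] = defaultdict(list)
    for x, y, z in points:
        # Assign every point to the grid cell that contains it.
        cells[(math.floor(x / gsd), math.floor(y / gsd))].append(z)
    # The median tolerates isolated outliers, unlike the mean.
    # Cells with too few observations are left empty.
    return {cell: median(zs) for cell, zs in cells.items() if len(zs) >= min_obs}


if __name__ == "__main__":
    observations = [
        (0.1, 0.1, 10.0),
        (0.2, 0.3, 10.2),
        (0.4, 0.4, 50.0),  # gross outlier from a mismatch
        (1.5, 0.2, 12.0),  # single observation, dropped by min_obs
    ]
    print(fuse_to_dsm(observations, gsd=1.0))
```

A height field of this kind stores exactly one value per cell, which is why it cannot represent facades or overhangs and why general 3D scenes call for a different fusion approach, as discussed above.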
Besides demonstrating the capability to process close-range datasets, results for large oblique airborne datasets are presented. The report closes with a summary, a discussion of limitations, and perspectives regarding improvements and enhancements. The implemented algorithms are core elements of the commercial software package SURE, which is freely available for scientific purposes.
