Efficient 3D Scene Modeling and Mosaicing

This book proposes a complete pipeline for monocular (single-camera) 3D mapping of terrestrial and underwater environments. The aim is to provide a solution to large-scale scene modeling that is both accurate and efficient. To this end, we have developed a novel Structure from Motion algorithm that increases mapping accuracy by registering camera views directly with the maps. The camera registration uses a dual approach that adapts to the type of environment being mapped. To further increase the accuracy of the resulting maps, a new method is presented for detecting images that correspond to the same scene region (crossovers). Crossovers are then used in conjunction with global alignment methods to greatly reduce estimation errors, especially when mapping large areas. The detection method is based on the Visual Bag of Words (BoW) paradigm and offers a simpler and more efficient solution by eliminating the training stage generally required by state-of-the-art BoW algorithms. Finally, to make the mapping of large areas more efficient, particularly with respect to the costs of map storage, transmission and rendering, an online 3D model simplification algorithm is proposed. This new algorithm has the advantage of selecting only those vertices that are geometrically representative of the scene.
