METRIC SCALE CALCULATION FOR VISUAL MAPPING ALGORITHMS

Abstract. Visual SLAM algorithms localize a camera by mapping its environment as a point cloud derived from visual cues. To obtain the camera locations in a metric coordinate system, the metric scale of the point cloud has to be known. This contribution describes a method to calculate the metric scale for a point cloud of an indoor environment, such as a parking garage, by fusing multiple individual scale values. The individual scale values are calculated from structures and objects with a priori known metric extent that can be identified in the unscaled point cloud. The extents of building structures, such as the driving lane or the room height, are derived from density peaks in the point distribution. The extents of objects, such as traffic signs with a known metric size, are derived by projecting their image detections onto the point cloud. The method is tested on synthetic image sequences of a drive through a virtual 3D model of a parking garage, recorded with a forward-looking monocular camera. The results show that each individual scale value either improves the robustness of the fused scale value or reduces its error. The error of the fused scale is comparable to that reported in other recent works.
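
The density-peak idea can be illustrated with a minimal sketch. It is not the implementation evaluated in this contribution. It assumes that floor and ceiling (or the two lane boundaries) show up as the two strongest histogram peaks along one coordinate axis, and it fuses the resulting scale values with a weighted mean, which is a stand-in chosen only for illustration. All function names, the synthetic point data, and the known extents of 3.0 m and 5.0 m are hypothetical.

```python
import random


def density_peaks(values, bins=100):
    """Return the centers of the two strongest, well-separated histogram bins."""
    lo, hi = min(values), max(values)
    if hi <= lo:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    first = max(range(bins), key=counts.__getitem__)
    # Ignore the neighbourhood of the first peak when searching for the second.
    guard = max(1, bins // 10)
    second = max((i for i in range(bins) if abs(i - first) > guard),
                 key=counts.__getitem__)
    return sorted(lo + (i + 0.5) * width for i in (first, second))


def scale_from_peaks(values, known_extent_m):
    """Metric scale = known metric extent / extent measured in the unscaled cloud."""
    low, high = density_peaks(values)
    return known_extent_m / (high - low)


def fuse_scales(scales, weights=None):
    """Weighted mean of individual scale values (assumed fusion rule)."""
    if weights is None:
        weights = [1.0] * len(scales)
    return sum(w * s for w, s in zip(weights, scales)) / sum(weights)


def two_planes(rng, pos_a, pos_b, n=2000, noise=0.01, clutter=500):
    """Synthetic 1D coordinates: two dense planes plus uniform clutter between them."""
    pts = [rng.gauss(pos_a, noise) for _ in range(n)]
    pts += [rng.gauss(pos_b, noise) for _ in range(n)]
    pts += [rng.uniform(pos_a, pos_b) for _ in range(clutter)]
    return pts


rng = random.Random(0)
z_coords = two_planes(rng, 0.0, 1.2)    # floor and ceiling, unscaled units
x_coords = two_planes(rng, -1.0, 1.0)   # lane boundaries, unscaled units

height_scale = scale_from_peaks(z_coords, known_extent_m=3.0)  # hypothetical room height
lane_scale = scale_from_peaks(x_coords, known_extent_m=5.0)    # hypothetical lane width
fused_scale = fuse_scales([height_scale, lane_scale])
print(height_scale, lane_scale, fused_scale)
```

Both synthetic structures are generated with a true scale of 2.5 m per unit, so the individual and fused values should land close to that figure, with the deviation bounded mainly by the histogram bin width. A scale value derived from a detected traffic sign could be appended to the list passed to `fuse_scales` in the same way.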
