Get Out of My Lab: Large-scale, Real-Time Visual-Inertial Localization

Accurately estimating a robot's pose relative to a global scene model and precisely tracking the pose in real-time is a fundamental problem for navigation and obstacle avoidance tasks. Due to the computational complexity of localization against a large map and the memory consumed by the model, state-of-the-art approaches are either limited to small workspaces or rely on a server-side system to query the global model while tracking the pose locally. The latter approaches face the problem of smoothly integrating the server's pose estimates into the trajectory computed locally to avoid temporal discontinuities. In this paper, we demonstrate that large-scale, real-time pose estimation and tracking can be performed on mobile platforms with limited resources without the use of an external server. This is achieved by employing map and descriptor compression schemes as well as efficient search algorithms from computer vision. We derive a formulation for integrating the global pose information into a local state estimator that produces much smoother trajectories than current approaches. Through detailed experiments, we evaluate each of our design choices individually and document its impact on the overall system performance, demonstrating that our approach outperforms state-of-the-art algorithms for localization at scale.

[1]  Dieter Schmalstieg,et al.  Real-time self-localization from panoramic images on mobile devices , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[2]  Torsten Sattler,et al.  Scalable 6-DOF Localization on Mobile Devices , 2014, ECCV.

[3]  Vincent Lepetit,et al.  Thick boundaries in binary space and their influence on nearest-neighbor search , 2012, Pattern Recognit. Lett..

[4]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[5]  Yaser Sheikh,et al.  3D Point Cloud Reduction Using Mixed-Integer Quadratic Programming , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[6]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  E. S. Pearson,et al.  On the Problem of the Most Efficient Tests of Statistical Hypotheses , 1933 .

[8]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[9]  Victor Lempitsky,et al.  The inverted multi-index , 2012, CVPR.

[10]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Stergios I. Roumeliotis,et al.  Vision-Aided Inertial Navigation for Spacecraft Entry, Descent, and Landing , 2009, IEEE Transactions on Robotics.

[12]  Torsten Sattler,et al.  Improving Image-Based Localization by Active Correspondence Search , 2012, ECCV.

[13]  Dieter Schmalstieg,et al.  Global Localization from Monocular SLAM on a Mobile Phone , 2014, IEEE Transactions on Visualization and Computer Graphics.

[14]  Torsten Sattler,et al.  Fast image-based localization using direct 2D-to-3D matching , 2011, 2011 International Conference on Computer Vision.

[15]  Tobias Höllerer,et al.  Wide-area scene mapping for mobile visual tracking , 2012, 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[16]  Gordon Wyeth,et al.  CAT-SLAM: probabilistic localisation and mapping using a continuous appearance-based trajectory , 2012, Int. J. Robotics Res..

[17]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[18]  Daniel P. Huttenlocher,et al.  Location Recognition Using Prioritized Feature Matching , 2010, ECCV.

[19]  Michael F. Cohen,et al.  Real-time image-based 6-DOF localization in large-scale environments , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Paul Newman,et al.  Closing loops without places , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[21]  Torsten Sattler,et al.  Image Retrieval for Image-Based Localization Revisited , 2012, BMVC.

[22]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[23]  David Nistér,et al.  Preemptive RANSAC for live structure and motion estimation , 2005, Machine Vision and Applications.

[24]  Stergios I. Roumeliotis,et al.  C-KLAM: Constrained keyframe-based localization and mapping , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[25]  Pascal Fua,et al.  Worldwide Pose Estimation Using 3D Point Clouds , 2012, ECCV.

[26]  Michael Bosse,et al.  Placeless Place-Recognition , 2014, 2014 2nd International Conference on 3D Vision.

[27]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Roland Siegwart,et al.  Keyframe-Based Visual-Inertial SLAM using Nonlinear Optimization , 2013, Robotics: Science and Systems.

[29]  David Nistér,et al.  An efficient solution to the five-point relative pose problem , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[30]  Michael Bosse,et al.  Keypoint design and evaluation for place recognition in 2D lidar maps , 2009, Robotics Auton. Syst..

[31]  Simon Lacroix,et al.  Probabilistic place recognition with covisibility maps , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[32]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[33]  Horst Bischof,et al.  Natural landmark-based monocular localization for MAVs , 2011, 2011 IEEE International Conference on Robotics and Automation.

[34]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[35]  James J. Little,et al.  Vision-based global localization and mapping for mobile robots , 2005, IEEE Transactions on Robotics.

[36]  Gabe Sibley,et al.  Sliding window filter with application to planetary landing , 2010 .

[37]  Fredrik Kahl,et al.  Accurate Localization and Pose Estimation for Large 3D Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Roland Siegwart,et al.  Comparison of nearest-neighbor-search strategies and implementations for efficient shape registration , 2012 .

[39]  Dorian Gálvez-López,et al.  Bags of Binary Words for Fast Place Recognition in Image Sequences , 2012, IEEE Transactions on Robotics.

[40]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Paul Newman,et al.  Highly scalable appearance-only SLAM - FAB-MAP 2.0 , 2009, Robotics: Science and Systems.

[42]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[43]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[44]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[45]  Anastasios I. Mourikis,et al.  Motion tracking with fixed-lag smoothing: Algorithm and consistency analysis , 2011, 2011 IEEE International Conference on Robotics and Automation.

[46]  Julien Pilet,et al.  Size Matters: Exhaustive Geometric Verification for Image Retrieval Accepted for ECCV 2012 , 2012, ECCV.

[47]  Jian Sun,et al.  Optimized Product Quantization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48]  Dieter Schmalstieg,et al.  Wide area localization on mobile phones , 2009, 2009 8th IEEE International Symposium on Mixed and Augmented Reality.

[49]  Gabe Sibley,et al.  Incremental unsupervised topological place discovery , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[50]  Hujun Bao,et al.  Keyframe-based real-time camera tracking , 2009, 2009 IEEE 12th International Conference on Computer Vision.