Keyframe-Based Camera Relocalization Method Using Landmark and Keypoint Matching

Camera relocalization is a challenging task, especially based on the sparse 3D map or keyframes. In this paper, we present an accurate method for RGB camera relocalization in the case of a very sparse 3D map built by limited keyframes. The core of our approach is a top-to-down feature matching strategy to provide a set of accurate 2D-to-3D matches. Specifically, we first use the landmark-based place recognition method to generate from keyframes the images similar to the current view along with the set of pairwise matched landmarks. This step constrains the 3D model points that can be matched with the current view. Then, the points are matched within the landmark pairs and combined afterward. This is in contrast to the conventional methods of feature matching that typically match points between the entire images and the whole 3D map, which, as a result, may not be robust to large viewpoint changes, the main challenge of the relocalization based on the sparse map. After feature matching, the camera pose is calculated by an efficient novel Perspective-n-Points (PnP) algorithm. We conduct experiments on challenging datasets to demonstrate that the camera poses estimated by our method based on the sparse 3D point cloud are more accurate than the classical methods using the dense map or a large number of training images.

[1]  Heikki Mannila,et al.  Random projection in dimensionality reduction: applications to image and text data , 2001, KDD '01.

[2]  Torsten Sattler,et al.  InLoc: Indoor Visual Localization with Dense Matching and View Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3]  Francesc Moreno-Noguer,et al.  Very Fast Solution to the PnP Problem with Algebraic Outlier Rejection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[5]  Torsten Sattler,et al.  Hyperpoints and Fine Vocabularies for Large-Scale Location Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[6]  Daniel Cremers,et al.  LDSO: Direct Sparse Odometry with Loop Closure , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[7]  Roberto Cipolla,et al.  PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Juan D. Tardós,et al.  Fast relocalisation and loop closing in keyframe-based SLAM , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[9]  Andrew W. Fitzgibbon,et al.  Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Ben Glocker,et al.  Real-time RGB-D camera relocalization , 2013, 2013 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[11]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[12]  Michael Milford,et al.  Place Recognition with ConvNet Landmarks: Viewpoint-Robust, Condition-Robust, Training-Free , 2015, Robotics: Science and Systems.

[13]  Niko Sünderhauf,et al.  Appearance change prediction for long-term navigation across seasons , 2013, 2013 European Conference on Mobile Robots.

[14]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[15]  Ian D. Reid,et al.  Automatic Relocalization and Loop Closing for Real-Time Monocular SLAM , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Dongbing Gu,et al.  Indoor Relocalization in Challenging Environments With Dual-Stream Convolutional Neural Networks , 2018, IEEE Transactions on Automation Science and Engineering.

[17]  Paul Newman,et al.  Shady dealings: Robust, long-term visual localisation using illumination invariance , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[18]  Eric Brachmann,et al.  DSAC — Differentiable RANSAC for Camera Localization , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Shilin Zhou,et al.  BoCNF: efficient image matching with Bag of ConvNet features for scalable and robust visual place recognition , 2018, Auton. Robots.

[20]  Niko Sünderhauf,et al.  On the performance of ConvNet features for place recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[21]  Roberto Cipolla,et al.  Geometric Loss Functions for Camera Pose Regression with Deep Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Torsten Sattler,et al.  Efficient & Effective Prioritized Matching for Large-Scale Image-Based Localization , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[24]  Andrew W. Fitzgibbon,et al.  Multi-output Learning for Camera Relocalization , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Lixin Fan,et al.  Real-time SLAM relocalization with online learning of binary feature indexing , 2017, Machine Vision and Applications.

[26]  Masatoshi Okutomi,et al.  ASPnP: An Accurate and Scalable Solution to the Perspective-n-Point Problem , 2013, IEICE Trans. Inf. Syst..

[27]  Roberto Cipolla,et al.  Modelling uncertainty in deep learning for camera relocalization , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Michael F. Cohen,et al.  Real-time image-based 6-DOF localization in large-scale environments , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[30]  Paul Newman,et al.  FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance , 2008, Int. J. Robotics Res..

[31]  Vincent Lepetit,et al.  Accurate Non-Iterative O(n) Solution to the PnP Problem , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[32]  Gordon Wyeth,et al.  SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights , 2012, 2012 IEEE International Conference on Robotics and Automation.

[33]  Daniel P. Huttenlocher,et al.  Location Recognition Using Prioritized Feature Matching , 2010, ECCV.

[34]  Jun Li,et al.  Towards A Deep Insight into Landmark-based Visual Place Recognition: Methodology and Practice , 2018, ArXiv.

[35]  Peer Neubert,et al.  Local region detector + CNN based landmarks for practical place recognition in changing environments , 2015, 2015 European Conference on Mobile Robots (ECMR).

[36]  Huaimin Wang,et al.  Using collaborative sharing on cloud for fast relocalization in keyframe-based SLAM , 2017, 2017 IEEE International Conference on Real-time Computing and Robotics (RCAR).

[37]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Daniel Cremers,et al.  Image-Based Localization Using LSTMs for Structured Feature Correlation , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[39]  Ben Glocker,et al.  Real-Time RGB-D Camera Relocalization via Randomized Ferns for Keyframe Encoding , 2015, IEEE Transactions on Visualization and Computer Graphics.

[40]  Yubin Kuang,et al.  Revisiting the PnP Problem: A Fast, General and Optimal Solution , 2013, 2013 IEEE International Conference on Computer Vision.

[41]  Wolfram Burgard,et al.  Robust Visual Localization Across Seasons , 2018, IEEE Transactions on Robotics.

[42]  Ming-Ming Cheng,et al.  GMS: Grid-Based Motion Statistics for Fast, Ultra-robust Feature Correspondence , 2019, International Journal of Computer Vision.

[43]  Pascal Fua,et al.  Worldwide Pose Estimation Using 3D Point Clouds , 2012, ECCV.

[44]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .