Will It Last? Learning Stable Features for Long-Term Visual Localization

An increasing number of simultaneous localization and mapping (SLAM) systems use appearance-based localization to improve the quality of pose estimates. However, as the time spans and areas to be covered grow, appearance-based maps often become too large to handle and contain features that are not always reliable for localization. This paper presents a method for selecting map features that persist over time and are thus suited for long-term localization. Our methodology relies on a CNN classifier that operates on image patches and depth maps to recognize which features remain matchable over the long term. Thus, the classifier considers not only the appearance of a feature but also its expected lifetime. As a result, our feature selection approach produces more compact maps with a higher fraction of temporally stable features than the current state of the art, while rejecting unstable features that typically harm localization. Our approach is validated on indoor and outdoor datasets that span several months.
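
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a per-feature stability score in [0, 1] is already available from some classifier. The names `MapFeature`, `select_stable_features`, and `threshold`, as well as the lookup-table scorer, are hypothetical and chosen only for illustration; they are not the paper's implementation, and the trained CNN is replaced here by a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MapFeature:
    """Hypothetical container for one map feature (illustration only)."""
    feature_id: int
    patch: List[List[float]]  # grayscale image patch around the keypoint
    depth: List[List[float]]  # depth values for the same patch


def select_stable_features(
    features: List[MapFeature],
    score_fn: Callable[[MapFeature], float],
    threshold: float = 0.5,
) -> List[MapFeature]:
    """Keep only features whose predicted stability score reaches the threshold.

    score_fn stands in for a trained classifier that maps a feature's
    image patch and depth map to a score in [0, 1].
    """
    return [f for f in features if score_fn(f) >= threshold]


# Placeholder scores standing in for classifier output (assumed values).
_placeholder_scores = {1: 0.9, 2: 0.2, 3: 0.5}


def placeholder_score(feature: MapFeature) -> float:
    """Look up a fixed score; a real system would run the classifier here."""
    return _placeholder_scores[feature.feature_id]


empty_patch = [[0.0]]
candidates = [MapFeature(i, empty_patch, empty_patch) for i in (1, 2, 3)]
compact_map = select_stable_features(candidates, placeholder_score, threshold=0.5)
kept_ids = [f.feature_id for f in compact_map]
```

Raising `threshold` in this sketch trades map size against the number of retained features; the value 0.5 is an arbitrary assumption rather than a setting taken from the paper.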
