Levelling the Playing Field: A Comprehensive Comparison of Visual Place Recognition Approaches under Changing Conditions

In recent years, the capability of Visual Place Recognition (VPR) methods has improved significantly, building on the success of hand-crafted and learnt visual features, temporal filtering, and the use of semantic scene information. The variety of approaches and the relatively recent growth of interest in the field have led to many different datasets and assessment methodologies, often focused only on precision-recall type metrics, which makes comparison difficult. In this paper we present a comprehensive approach to evaluating the performance of 10 recently developed, state-of-the-art VPR techniques using three standardized metrics: (a) Matching Performance, (b) Matching Time, and (c) Memory Footprint. Together, these analyses provide an up-to-date and broad snapshot of the strengths and weaknesses of contemporary approaches to the VPR problem. The aim of this work is to help move this research field towards a more mature and unified approach to the problem, enabling better comparison and hence more progress in future research.
