A Benchmark Comparison of Visual Place Recognition Techniques for Resource-Constrained Embedded Platforms

Visual Place Recognition (VPR) has been a subject of significant research over the last 15 to 20 years. VPR is a fundamental task for autonomous navigation as it enables self-localization within an environment. Although robots are often equipped with resource-constrained hardware, the computational requirements of VPR techniques, and the effects of such hardware on them, have received little attention. In this work, we present a hardware-focused benchmark evaluation of a number of state-of-the-art VPR techniques on public datasets. We consider popular single-board computers, including ODroid, UP and Raspberry Pi 3, in addition to a commodity desktop and laptop for reference. We base our analysis on several key metrics, including place-matching accuracy, image encoding time, descriptor matching time and memory requirements. Key questions addressed include: (1) How does the accuracy of a VPR technique change with processor architecture? (2) How does power consumption vary across VPR techniques and embedded platforms? (3) How much does descriptor size matter relative to the storage available on today’s embedded platforms? (4) How does the performance of a high-end platform relate to that of an on-board low-end embedded platform for VPR? The extensive analysis and results in this work serve not only as a benchmark for the VPR community, but also provide useful insights for real-world adoption of VPR applications.
