Are State-of-the-art Visual Place Recognition Techniques any Good for Aerial Robotics?

Visual Place Recognition (VPR) has seen significant advances in both matching performance and computational efficiency over the past few years. However, these advances have been evaluated on ground-based mobile platforms, and the results cannot be generalized to aerial platforms. Aerial robots experience a complex degree of viewpoint variation, and their processing power and on-board memory are limited by payload size and battery ratings. Therefore, in this paper, we collect $8$ state-of-the-art VPR techniques that have previously been evaluated for ground-based platforms and compare them on $2$ recently proposed aerial place recognition datasets with three prime focuses: a) matching performance, b) processing power consumption, and c) projected memory requirements. This gives a bird's-eye view of the applicability of contemporary VPR research to aerial robotics and lays down the nature of the challenges for aerial VPR.
