Hierarchical Encoding of Sequential Data With Compact and Sub-Linear Storage Cost

Snapshot-based visual localization is an important problem in several computer vision and robotics applications such as Simultaneous Localization And Mapping (SLAM). To achieve real-time performance in very large-scale environments with massive amounts of training and map data, techniques such as approximate nearest neighbor search (ANN) algorithms are used. While several state-of-the-art variants of quantization and indexing techniques have demonstrated to be efficient in practice, their theoretical memory cost still scales at least linearly with the training data (i.e., O(n) where n is the number of instances in the database), since each data point must be associated with at least one code vector. To address these limitations, in this paper we present a totally new hierarchical encoding approach that enables a sub-linear storage scale. The algorithm exploits the widespread sequential nature of sensor information streams in robotics and autonomous vehicle applications and achieves, both theoretically and experimentally, sub-linear scalability in storage required for a given environment size. Furthermore, the associated query time of our algorithm is also of sub-linear complexity. We benchmark the performance of the proposed algorithm on several real-world benchmark datasets and experimentally validate the theoretical sub-linearity of our approach, while also showing that our approach yields competitive absolute storage performance as well.

[1]  Giorgos Tolias,et al.  Fine-Tuning CNN Image Retrieval with No Human Annotation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[3]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[4]  Jian Sun,et al.  Optimized Product Quantization for Approximate Nearest Neighbor Search , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Victor S. Lempitsky,et al.  Efficient Indexing of Billion-Scale Datasets of Deep Descriptors , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Huu Le,et al.  DeepVQ: A Deep Network Architecture for Vector Quantization , 2018, CVPR Workshops.

[7]  Ondrej Chum,et al.  CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples , 2016, ECCV.

[8]  Gordon Wyeth,et al.  SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights , 2012, 2012 IEEE International Conference on Robotics and Automation.

[9]  Victor S. Lempitsky,et al.  AnnArbor: Approximate Nearest Neighbors Using Arborescence Coding , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  José Ruíz Ascencio,et al.  Visual simultaneous localization and mapping: a survey , 2012, Artificial Intelligence Review.

[11]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[12]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Josef Sivic,et al.  NetVLAD: CNN Architecture for Weakly Supervised Place Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Torsten Sattler,et al.  Image Retrieval for Image-Based Localization Revisited , 2012, BMVC.

[15]  Yannis Avrithis,et al.  Locally Optimized Product Quantization for Approximate Nearest Neighbor Search , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[17]  Huu Le,et al.  Binary Constrained Deep Hashing Network for Image Retrieval Without Manual Annotation , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[18]  Jan-Michael Frahm,et al.  Structure-from-Motion Revisited , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Allen Gersho,et al.  Vector quantization and signal compression , 1991, The Kluwer international series in engineering and computer science.

[20]  Peter I. Corke,et al.  Visual Place Recognition: A Survey , 2016, IEEE Transactions on Robotics.

[21]  Masatoshi Okutomi,et al.  24/7 Place Recognition by View Synthesis , 2015, CVPR.

[22]  Matthijs Douze,et al.  Searching in one billion vectors: Re-rank with source coding , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[23]  Ngai-Man Cheung,et al.  Selective Deep Convolutional Features for Image Retrieval , 2017, ACM Multimedia.

[24]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[25]  Victor S. Lempitsky,et al.  Tree quantization for large-scale similarity search and classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Victor S. Lempitsky,et al.  Product Split Trees , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  J. M. M. Montiel,et al.  ORB-SLAM: A Versatile and Accurate Monocular SLAM System , 2015, IEEE Transactions on Robotics.

[28]  Torsten Sattler,et al.  Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[31]  Christian Böhm,et al.  Towards an Optimal Subspace for K-Means , 2017, KDD.

[32]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[33]  Michael Milford,et al.  Sequence searching with deep-learnt depth for condition- and viewpoint-invariant route-based place recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[34]  Meena Mahajan,et al.  The Planar k-means Problem is NP-hard I , 2009 .

[35]  Michael Milford,et al.  Rhythmic Representations: Learning Periodic Patterns for Scalable Place Recognition at a Sublinear Storage Cost , 2018, IEEE Robotics and Automation Letters.

[36]  Koby Crammer,et al.  On the Learnability and Design of Output Codes for Multiclass Problems , 2002, Machine Learning.

[37]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.