Paper Information - Hierarchical Encoding of Sequential Data With Compact and Sub-Linear Storage Cost

Hierarchical Encoding of Sequential Data With Compact and Sub-Linear Storage Cost

Snapshot-based visual localization is an important problem in several computer vision and robotics applications, such as Simultaneous Localization And Mapping (SLAM). To achieve real-time performance in very large-scale environments with massive amounts of training and map data, techniques such as approximate nearest neighbor (ANN) search are used. While several state-of-the-art quantization and indexing techniques have been shown to be efficient in practice, their theoretical memory cost still scales at least linearly with the training data (i.e., O(n), where n is the number of instances in the database), since each data point must be associated with at least one code vector. To address this limitation, we present a new hierarchical encoding approach that enables sub-linear storage scaling. The algorithm exploits the sequential nature of sensor information streams, which is widespread in robotics and autonomous vehicle applications, and achieves, both theoretically and experimentally, sub-linear scaling of the storage required for a given environment size. Furthermore, the query time of our algorithm is also of sub-linear complexity. We evaluate the proposed algorithm on several real-world benchmark datasets and experimentally validate its theoretical sub-linearity, while also showing that it yields competitive absolute storage performance.
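To make the O(n) baseline in the abstract concrete, the following is a minimal sketch of the storage cost of a flat vector-quantization index, in which every database item keeps one code that points into a shared codebook. It illustrates the linear baseline only and is not the paper's hierarchical method. The function name `vq_storage_bytes` and the parameter values (256 centroids, 128-dimensional descriptors, 4-byte floats) are assumptions chosen for illustration.

```python
def vq_storage_bytes(n, k, d, bytes_per_float=4):
    """Bytes needed by a flat VQ index: one code per item plus the codebook.

    n: number of database items, k: number of centroids,
    d: descriptor dimension (all values here are illustrative assumptions).
    """
    bits_per_code = (k - 1).bit_length()      # bits needed to index k centroids
    codes_bytes = -(-n * bits_per_code // 8)  # ceiling division to whole bytes
    codebook_bytes = k * d * bytes_per_float  # fixed cost, independent of n
    return codes_bytes + codebook_bytes


# The per-item term dominates as n grows, so total storage is linear in n.
for n in (10_000, 100_000, 1_000_000):
    print(n, vq_storage_bytes(n, k=256, d=128))
```

With these assumed values the codebook is a constant 131,072 bytes, while each additional item adds one byte, which is the linear term the abstract refers to.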
