A Comparative Study on Hierarchical Navigable Small World Graphs

Hierarchical navigable small world (HNSW) graphs get more and more popular on large-scale nearest neighbor search tasks since the source codes were released two years ago. The attractiveness of this approach lies in its superior performance over most of the known nearest neighbor search approaches as well as its genericness to various distance measures. In this paper, several comparative studies have been conducted on this search approach. The role of hierarchical structure in HNSW and the function of HNSW graph itself are investigated. We find that the hierarchical structure in HNSW could not achieve "a much better logarithmic complexity scaling" as it was claimed in the paper, particularly on high dimensional data. Moreover, we find that similar high search speed efficiency as HNSW could be achieved with the support of flat k-NN graph after graph diversification. Finally, we point out the difficulty, that is faced by most of the graph based search approaches, is directly linked to "curse of dimensionality". HNSW, like other graph based approaches, is unable to address such difficulty.

[1]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Vladimir Krylov,et al.  Approximate nearest neighbor algorithm based on navigable small world graphs , 2014, Inf. Syst..

[3]  Xuemin Lin,et al.  Approximate Nearest Neighbor Search on High Dimensional Data — Experiments, Analyses, and Improvement , 2016, IEEE Transactions on Knowledge and Data Engineering.

[4]  Wan-Lei Zhao k-NN Graph Construction: a Generic Online Approach , 2018, ArXiv.

[5]  Marios Hadjieleftheriou,et al.  R-Trees - A Dynamic Index Structure for Spatial Searching , 2008, ACM SIGSPATIAL International Workshop on Advances in Geographic Information Systems.

[6]  Cheng Wang,et al.  Approximate Nearest Neighbor Search by Residual Vector Quantization , 2010, Sensors.

[7]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[8]  Zhe Wang,et al.  Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search , 2007, VLDB.

[9]  Jingdong Wang,et al.  Composite Quantization for Approximate Nearest Neighbor Search , 2014, ICML.

[10]  Peter J. Bickel,et al.  Maximum Likelihood Estimation of Intrinsic Dimension , 2004, NIPS.

[11]  Victor S. Lempitsky,et al.  Efficient Indexing of Billion-Scale Datasets of Deep Descriptors , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[13]  Richard I. Hartley,et al.  Optimised KD-trees for fast image descriptor matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[15]  Deng Cai,et al.  EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph , 2016, ArXiv.

[16]  Yasin Abbasi-Yadkori,et al.  Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph , 2011, IJCAI.

[17]  Cordelia Schmid,et al.  Evaluation of GIST descriptors for web-scale image search , 2009, CIVR '09.

[18]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[19]  Hans-Peter Kriegel,et al.  The X-tree : An Index Structure for High-Dimensional Data , 2001, VLDB.

[20]  Yury A. Malkov,et al.  Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Douglas Comer,et al.  Ubiquitous B-Tree , 1979, CSUR.

[22]  Kai Li,et al.  Efficient k-nearest neighbor graph construction for generic similarity measures , 2011, WWW.

[23]  Tom Drummond,et al.  FANNG: Fast Approximate Nearest Neighbour Graphs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Jon M. Kleinberg,et al.  The small-world phenomenon: an algorithmic perspective , 2000, STOC '00.

[25]  David G. Lowe,et al.  Scalable Nearest Neighbor Algorithms for High Dimensional Data , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.