Graph Reordering for Cache-Efficient Near Neighbor Search

Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a simple walk along a pruned near neighbor graph. Such algorithms consistently perform at the top of industrial speed benchmarks for applications such as embedding search. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception. Our measurements show that popular search indices such as the hierarchical navigable small world graph (HNSW) can incur high cache miss rates. To address this problem, we apply graph reordering algorithms to near neighbor graphs. Graph reordering is a memory layout optimization that places nodes that are commonly accessed together close to one another in memory. We present extensive experiments applying several reordering algorithms to a leading graph-based near neighbor method based on the HNSW index. We find that reordering improves the query time by up to 40%, and we demonstrate that the time needed to reorder the graph is negligible compared to the time required to construct the index.
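To make the idea of graph reordering concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes the graph is stored as a plain adjacency list and uses a breadth-first relabeling as a stand-in for a real reordering algorithm. The function names (bfs_order, relabel, mean_edge_gap) and the "mean edge gap" locality proxy are illustrative assumptions introduced here.

```python
from collections import deque


def bfs_order(adj, start=0):
    """Return new_id, where new_id[old] is the BFS visit position of node `old`."""
    n = len(adj)
    if n == 0:
        return []
    new_id = [-1] * n
    next_label = 0
    # Restart from every unvisited node so disconnected graphs are covered.
    for root in [start] + [v for v in range(n) if v != start]:
        if new_id[root] != -1:
            continue
        new_id[root] = next_label
        next_label += 1
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if new_id[v] == -1:
                    new_id[v] = next_label
                    next_label += 1
                    queue.append(v)
    return new_id


def relabel(adj, new_id):
    """Rebuild the adjacency list so that node `old` is stored at index new_id[old]."""
    out = [None] * len(adj)
    for old, nbrs in enumerate(adj):
        out[new_id[old]] = sorted(new_id[v] for v in nbrs)
    return out


def mean_edge_gap(adj):
    """Average |u - v| over all stored edges: a rough proxy for memory locality."""
    gaps = [abs(u - v) for u, nbrs in enumerate(adj) for v in nbrs]
    return sum(gaps) / len(gaps) if gaps else 0.0


# A 5-node path 0-3-1-4-2 whose labels are scattered in memory order.
graph = [[3], [3, 4], [4], [0, 1], [1, 2]]
order = bfs_order(graph)
reordered = relabel(graph, order)

print(mean_edge_gap(graph))      # 2.5
print(mean_edge_gap(reordered))  # 1.0
```

In a real index, the same permutation would also have to be applied to the stored vectors and to any entry-point identifiers, so that neighbors reached during a walk sit near each other in memory.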
