Tree-based Search Graph for Approximate Nearest Neighbor Search

Nearest neighbor search supports important applications in many domains, such as database, machine learning, computer vision. Since the computational cost for accurate search is too high, the community turned to the research of approximate nearest neighbor search (ANNS). Among them, graph-based algorithm is one of the most important branches. Research by Fu et al. shows that the algorithms based on Monotonic Search Network (MSNET), such as NSG and NSSG, have achieved the state-of-the-art search performance in efficiency. The MSNET is dedicated to achieving monotonic search with minimal out-degree of nodes to pursue high efficiency. However, the current MSNET designs did not optimize the probability of the monotonic search, and the lower bound of the probability is only 50%. If they fail in monotonic search stage, they have to suffer tremendous backtracking cost to achieve the required accuracy. This will cause performance problems in search efficiency. To address this problem, we propose (r,p)-MSNET, which achieves guaranteed probability on monotonic search. Due to the high building complexity of a strict (r,p)-MSNET, we propose TBSG, which is an approximation with low complexity. Experiments conducted on four million-scaled datasets show that TBSG yields better or comparable search performance than all recent state-of-the-art algorithms. Our code has ∗Corresponding author 1Hunan University, Changsha, 410082, China; E-mail: fanxbin@hnu.edu.cn, xpwang@hnu.edu.cn, xuelei@hnu.edu.cn 2National University of Defense Technology, Changsha, 410073, China; E-mail: kailu@nudt.edu.cn 3National Key Laboratory of Science and Technology on Information System Security, Beijing, China Preprint submitted to Elsevier January 11, 2022 ar X iv :2 20 1. 03 23 7v 1 [ cs .I R ] 1 0 Ja n 20 22 been released on Github.

[1]  David G. Lowe,et al.  Shape indexing using approximate nearest-neighbour search in high-dimensional spaces , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[3]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[4]  P. Alam,et al.  R , 1823, The Herodotus Encyclopedia.

[5]  Deng Cai,et al.  Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph , 2017, Proc. VLDB Endow..

[6]  Arnab Bhattacharya,et al.  HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces , 2018, Proc. VLDB Endow..

[7]  Jingdong Wang,et al.  Composite Quantization for Approximate Nearest Neighbor Search , 2014, ICML.

[8]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[9]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[10]  Jose A. Costa,et al.  Estimating Local Intrinsic Dimension with k-Nearest Neighbor Graphs , 2005, IEEE/SP 13th Workshop on Statistical Signal Processing, 2005.

[11]  Deng Cai,et al.  EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph , 2016, ArXiv.

[12]  Keinosuke Fukunaga,et al.  A Branch and Bound Algorithm for Computing k-Nearest Neighbors , 1975, IEEE Transactions on Computers.

[13]  D. W. Dearholt,et al.  Monotonic Search Networks For Computer Vision Databases , 1988, Twenty-Second Asilomar Conference on Signals, Systems and Computers.

[14]  Jian Sun,et al.  Optimized Product Quantization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Yury A. Malkov,et al.  Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Richard I. Hartley,et al.  Optimised KD-trees for fast image descriptor matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Vladimir Krylov,et al.  Approximate nearest neighbor algorithm based on navigable small world graphs , 2014, Inf. Syst..

[18]  Victor Lempitsky,et al.  Additive Quantization for Extreme Vector Compression , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[20]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[21]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Franz Aurenhammer,et al.  Voronoi diagrams—a survey of a fundamental geometric data structure , 1991, CSUR.

[23]  John Langford,et al.  Cover trees for nearest neighbor , 2006, ICML.

[24]  Deng Cai,et al.  Density Sensitive Hashing , 2012, IEEE Transactions on Cybernetics.

[25]  Tom Drummond,et al.  FANNG: Fast Approximate Nearest Neighbour Graphs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Xuemin Lin,et al.  Approximate Nearest Neighbor Search on High Dimensional Data — Experiments, Analyses, and Improvement , 2016, IEEE Transactions on Knowledge and Data Engineering.

[27]  Marián Boguñá,et al.  Navigability of Complex Networks , 2007, ArXiv.

[28]  Anthony K. H. Tung,et al.  LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index , 2016, SIGMOD Conference.

[29]  Yasin Abbasi-Yadkori,et al.  Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph , 2011, IJCAI.

[30]  Mike Izbicki,et al.  Faster cover trees , 2015, ICML.

[31]  Lawrence Cayton,et al.  Fast nearest neighbor retrieval for bregman divergences , 2008, ICML '08.