Large Scale Nearest Neighbors Search Based on Neighborhood Graph

Large-scale approximate k-nearest neighbors search is an important technique for many multimedia retrieval applications. Most existing search algorithms rely on centralized indexing and therefore cannot meet the demands of searching very large datasets. This paper proposes an efficient distributed approximate k-nearest neighbors search algorithm that operates over a billion high-dimensional visual descriptors. We propose a randomized partitioning strategy and then design a two-layer distributed indexing scheme based on a neighborhood graph for large-scale k-nearest neighbors search. Experimental results show that our method achieves excellent performance and scalability.
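The abstract does not give implementation details, so the following is only a minimal single-process sketch of the general idea, under stated assumptions: descriptors are split into partitions uniformly at random, each partition builds its own k-nearest-neighbor graph, and a coordinating layer sends the query to every partition and merges the local results. Reading "two-layer" as "coordinator plus per-partition graph index" is an assumption, as are all function names, the brute-force graph construction, and the `ef` candidate-list parameter; none of them come from the paper.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def random_partition(n_points, n_parts, rng):
    """Shuffle point ids and deal them round-robin into n_parts groups."""
    ids = list(range(n_points))
    rng.shuffle(ids)
    return [ids[i::n_parts] for i in range(n_parts)]


def build_knn_graph(points, ids, k):
    """Exact k-NN graph over one partition (brute force; sketch only)."""
    graph = {}
    for i in ids:
        others = (j for j in ids if j != i)
        graph[i] = heapq.nsmallest(
            k, others, key=lambda j: dist(points[i], points[j])
        )
    return graph


def graph_search(points, graph, query, entry, k, ef=32):
    """Best-first search on a neighborhood graph, starting from `entry`.

    Keeps at most `ef` candidates (assumed parameter) and returns the
    k closest as sorted (distance, id) pairs.
    """
    d0 = dist(query, points[entry])
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap of nodes still to expand
    best = [(-d0, entry)]      # max-heap (negated) of current candidates
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break  # closest unexpanded node cannot improve the result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(query, points[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(frontier, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-nd, i) for nd, i in best)[:k]


def distributed_search(points, partitions, graphs, query, k):
    """Coordinator layer: query every partition, then merge local top-k."""
    candidates = []
    for ids, graph in zip(partitions, graphs):
        # ids[0] is an arbitrary entry point; a real system would pick better.
        candidates.extend(graph_search(points, graph, query, ids[0], k))
    return heapq.nsmallest(k, candidates)
```

Because a random partition makes each shard a uniform sample of the data, the query has to visit every shard in this sketch; in a real deployment the per-partition loop would run in parallel on separate machines.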
