Query Range Sensitive Probability Guided Multi-probe Locality Sensitive Hashing

Locality Sensitive Hashing (LSH) builds indexes for approximate similarity search in high-dimensional spaces. Multi-Probe LSH (MPLSH) is a variant of LSH that reduces the number of hash tables required. Building on MPLSH, this paper proposes a novel probability model and a query-adaptive algorithm that generate the optimal multi-probe sequence for range queries. The probability model takes the query range into account, so the resulting probe sequence is optimal for that range. Furthermore, instead of a fixed number of probe steps, the algorithm uses a query-adaptive threshold to control search quality. We evaluate our method on an open dataset. The experimental results show that it probes fewer points than MPLSH to reach the same recall and, as a result, achieves an average speedup of 10% over MPLSH.
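The idea of a range-aware probe sequence with a query-adaptive stopping rule can be illustrated with a minimal Python sketch. It is not the paper's model: it assumes hash functions of the p-stable form h(x) = floor((a·x + b) / W) with Gaussian a, scores each neighbouring bucket by the probability that a point at exactly the query range R from the query lands in it, and stops probing once the accumulated probability reaches a threshold. All names (`slot_probabilities`, `probe_sequence`, `planes`, `shifts`, `width`, `radius`, `threshold`) are hypothetical, and the exhaustive enumeration of 3^M buckets is only practical for a small number M of hash functions.

```python
import itertools
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def slot_probabilities(offset, width, radius):
    """For one hash function, return the probability that a point at distance
    `radius` from the query falls in the slot left of (-1), equal to (0), or
    right of (+1) the query's slot. `offset` is the query's position inside
    its own slot, in [0, width). Assumption: the projected difference is
    Gaussian with standard deviation `radius`."""
    left, right = -offset, width - offset  # slot boundaries relative to the query

    def cdf(t):
        return normal_cdf(t / radius)

    return {
        -1: cdf(left) - cdf(left - width),
        0: cdf(right) - cdf(left),
        +1: cdf(right + width) - cdf(right),
    }


def probe_sequence(query, planes, shifts, width, radius, threshold):
    """Rank the 3^M buckets around the query's bucket by estimated probability
    and keep probing until the accumulated probability reaches `threshold`."""
    projections = [
        sum(a * x for a, x in zip(plane, query)) + b
        for plane, b in zip(planes, shifts)
    ]
    slots = [math.floor(f / width) for f in projections]
    per_function = [
        slot_probabilities(f - h * width, width, radius)
        for f, h in zip(projections, slots)
    ]

    candidates = []
    for deltas in itertools.product((-1, 0, 1), repeat=len(planes)):
        prob = math.prod(p[d] for p, d in zip(per_function, deltas))
        bucket = tuple(h + d for h, d in zip(slots, deltas))
        candidates.append((prob, bucket))
    candidates.sort(key=lambda c: c[0], reverse=True)

    sequence, covered = [], 0.0
    for prob, bucket in candidates:
        if covered >= threshold:  # query-adaptive stop, not a fixed step count
            break
        sequence.append((bucket, prob))
        covered += prob
    return sequence, covered


# Toy setup with made-up parameters: 4 hash functions over 8-dimensional data.
rng = random.Random(0)
dim, num_hashes, width = 8, 4, 4.0
planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)]
shifts = [rng.uniform(0.0, width) for _ in range(num_hashes)]
query = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

sequence, covered = probe_sequence(
    query, planes, shifts, width, radius=1.0, threshold=0.9
)
print(len(sequence), round(covered, 3))
```

In this sketch the number of probed buckets depends on both the query's position inside its slots and the query range: a query near a slot boundary, or a larger `radius`, spreads probability over more buckets, so more probes are needed to reach the same threshold.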
