Lattice-based Locality Sensitive Hashing is Optimal

Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC '98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The measure of quality of an LSH family is its LSH exponent, which helps determine both query time and space usage. In a seminal work, Andoni and Indyk (FOCS '06) constructed an LSH family based on random ball partitioning of space that achieves an LSH exponent of 1/c^2 for the ℓ_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA '07) and O'Donnell, Wu and Zhou (TOCT '14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. In the same work, Andoni and Indyk therefore proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, which left open the question of determining the exponent of lattice-based LSH. In this work, we resolve this question by using techniques from the geometry of numbers to show the existence of lattices achieving the optimal LSH exponent of 1/c^2. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g., to yield low space and query complexity, remains an important open problem.
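As a rough illustration of the idea of lattice-based hashing (and not the construction analyzed in this work), the sketch below hashes a point to its nearest vector in a randomly shifted, scaled copy of the integer lattice Z^d, where finding the nearest lattice vector reduces to coordinate-wise rounding. The function names, the choice of Z^d, and all parameter values are assumptions made for illustration only; Z^d is not claimed to achieve the 1/c^2 exponent.

```python
import math
import random

def make_lattice_hash(dim, width, rng):
    """Build a hash for a randomly shifted copy of the scaled integer lattice width * Z^dim."""
    # Illustrative assumption: Z^dim stands in for the lattice; it is not the optimal one.
    shift = [rng.uniform(0.0, width) for _ in range(dim)]

    def hash_point(x):
        # The nearest lattice vector in Z^dim is found by rounding each coordinate.
        return tuple(math.floor((xi - si) / width + 0.5) for xi, si in zip(x, shift))

    return hash_point

def random_direction(dim, rng):
    """Sample a uniformly random unit vector via normalized Gaussians."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def collision_rate(dim, width, distance, trials=2000, seed=0):
    """Estimate how often two points at the given distance share a bucket."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_lattice_hash(dim, width, rng)  # fresh random shift per trial
        u = random_direction(dim, rng)
        p = [0.0] * dim
        q = [distance * ui for ui in u]
        hits += h(p) == h(q)
    return hits / trials

near = collision_rate(dim=8, width=4.0, distance=1.0)
far = collision_rate(dim=8, width=4.0, distance=3.0)
print(f"collision rate at distance 1: {near:.3f}, at distance 3: {far:.3f}")
```

Writing p_1 for the collision probability of points at distance r and p_2 for that of points at distance cr, the LSH exponent is ln(1/p_1) / ln(1/p_2); the two estimated rates above play the roles of p_1 and p_2 for r = 1 and c = 3 in this toy setting.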

[1] Cordelia Schmid et al. Query adaptative locality sensitive hashing, 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[2] Alexandr Andoni et al. Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[3] Alexandr Andoni et al. Practical and Optimal LSH for Angular Distance, 2015, NIPS.

[4] Piotr Indyk et al. Similarity Search in High Dimensions via Hashing, 1999, VLDB.

[5] Nicole Immorlica et al. Locality-sensitive hashing scheme based on p-stable distributions, 2004, SCG '04.

[6] Daniel Dadush et al. Solving the Closest Vector Problem in 2^n Time -- The Discrete Gaussian Strikes Again!, 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[7] Simon Litsyn et al. Lattices which are good for (almost) everything, 2003, Proceedings 2003 IEEE Information Theory Workshop (Cat. No.03EX674).

[8] Léo Ducas et al. The closest vector problem in tensored root lattices of type A and in their duals, 2018, Des. Codes Cryptogr.

[9] Philip A. Wilsey et al. A GPGPU Algorithm for c-Approximate r-Nearest Neighbor Search in High Dimensions, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.

[10] Ravi Kannan et al. Minkowski's Convex Body Theorem and Integer Programming, 1987, Math. Oper. Res.

[11] W. Schmidt. The measure of the set of admissible lattices, 1958.

[12] Gregory Valiant et al. Finding Correlations in Subquadratic Time, with Applications to Learning Parities and Juntas, 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[13] Alexandr Andoni et al. Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing, 2015, SoCG.

[14] Alexandr Andoni et al. Optimal Hashing-based Time-Space Trade-offs for Approximate Near Neighbors, 2016, SODA.

[15] Piotr Indyk et al. Approximate nearest neighbors: towards removing the curse of dimensionality, 1998, STOC '98.

[16] Anja Becker et al. New directions in nearest neighbor searching with applications to lattice sieving, 2016, IACR Cryptol. ePrint Arch.

[17] Piotr Indyk et al. Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality, 2012, Theory Comput.

[18] Carl Ludwig Siegel et al. A Mean Value Theorem in Geometry of Numbers, 1945.

[19] Daniel Dadush et al. Short Paths on the Voronoi Graph and Closest Vector Problem with Preprocessing, 2014, SODA.

[20] C. A. Rogers. Lattice Coverings of Space: The Minkowski–Hlawka Theorem, 1958.

[21] Alexandr Andoni et al. Optimal Data-Dependent Hashing for Approximate Near Neighbors, 2015, STOC.

[22] Thijs Laarhoven et al. Sieving for Shortest Vectors in Lattices Using Angular Locality-Sensitive Hashing, 2015, CRYPTO.

[23] Trevor Darrell et al. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing), 2006.

[24] Kave Eshghi et al. Locality sensitive hash functions based on concomitant rank order statistics, 2008, KDD.

[25] O. Amrani et al. Efficient bounded-distance decoding of the hexacode and associated decoders for the Leech lattice and the Golay code, 1994, Proceedings of 1994 IEEE International Symposium on Information Theory.

[26] Tobias Christiani et al. A Framework for Similarity Search with Space-Time Tradeoffs using Locality-Sensitive Filtering, 2016, SODA.

[27] Yi Wu et al. Optimal Lower Bounds for Locality-Sensitive Hashing (Except When q is Tiny), 2014, TOCT.

[28] Miklós Ajtai et al. Random lattices and a conjectured 0 - 1 law about their polynomial time computable properties, 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[29] Rajeev Motwani et al. Lower bounds on locality sensitive hashing, 2005, SCG '06.

[30] Tanaka Yuzuru et al. Spherical LSH for Approximate Nearest Neighbor Search on Unit Hypersphere, 2007.

[31] Frank Vallentin et al. Sphere Covering, Lattices, and Tilings (in Low Dimensions), 2003.

[32] Alex J. Grant et al. Finding a Closest Point in a Lattice of Voronoi's First Kind, 2014, SIAM J. Discret. Math.

[33] Alexandr Andoni et al. Beyond Locality-Sensitive Hashing, 2013, SODA.