Lower bounds for high dimensional nearest neighbor search and related problems

Despite extensive and continuing research, the best known algorithms for various geometric search problems (such as nearest neighbor search) have performance that degrades exponentially in the dimension. This phenomenon is sometimes called the curse of dimensionality. Recent results [37, 38, 40] show that, in some sense, it is possible to avoid the curse of dimensionality for the approximate nearest neighbor search problem. But must the exact nearest neighbor search problem suffer this curse? We provide some evidence in support of the curse. Specifically, we investigate the exact nearest neighbor search problem and the related problem of exact partial match within the asymmetric communication model first used by Miltersen [43] to study data structure problems. We derive non-trivial asymptotic lower bounds for the exact problem that stand in contrast to known algorithms for approximate nearest neighbor search.

[1]  Jeff Erickson,et al.  Space-Time Tradeoffs for Emptiness Queries , 2000, SIAM J. Comput..

[2]  Jeff Erickson New lower bounds for Hopcroft's problem , 1995, SCG '95.

[3]  Sunil Arya,et al.  An optimal algorithm for approximate nearest neighbor searching fixed dimensions , 1998, JACM.

[4]  Sunil Arya,et al.  Approximate nearest neighbor queries in fixed dimensions , 1993, SODA '93.

[5]  Andrew Chi-Chih Yao,et al.  Should tables be sorted? , 1981, 19th Annual Symposium on Foundations of Computer Science (sfcs 1978).

[6]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Dima Grigoriev,et al.  Randomized complexity lower bounds , 1998, STOC '98.

[8]  Dennis J. Volper,et al.  The Complexity of Partial Match Retrieval in a Dynamic Setting , 1982, J. Algorithms.

[9]  Noam Nisan,et al.  Neighborhood preserving hashing and approximate queries , 1994, SODA '94.

[10]  Ronald Fagin,et al.  Fuzzy queries in multimedia database systems , 1998, PODS '98.

[11]  Peter Bro Miltersen,et al.  On data structures and asymmetric communication complexity , 1994, STOC '95.

[12]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[13]  Dima Grigoriev,et al.  Randomized Complexity Lower Bound for Arrangements and Polyhedra , 1999, Discret. Comput. Geom..

[14]  Piotr Indyk On approximate nearest neighbors in non-Euclidean spaces , 1998, Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280).

[15]  Miklós Ajtai,et al.  A lower bound for finding predecessors in Yao's cell probe model , 1988, Comb..

[16]  Michael E. Saks,et al.  Time-Space Tradeoffs for Branching Programs , 2001, J. Comput. Syst. Sci..

[17]  Yuval Rabani,et al.  Tighter bounds for nearest neighbor search and related problems in the cell probe model , 2000, STOC '00.

[18]  Danny Dolev,et al.  Finding the neighborhood of a query in a dictionary , 1993, [1993] The 2nd Israel Symposium on Theory and Computing Systems.

[19]  Marek Karpinski,et al.  Randomized Ω(n^2) Lower Bound for , 2007 .

[20]  Jon M. Kleinberg,et al.  Two algorithms for nearest-neighbor search in high dimensions , 1997, STOC '97.

[21]  Jeff Erickson,et al.  New lower bounds for Hopcroft's problem , 1995, SCG '95.

[22]  Rafail Ostrovsky,et al.  Efficient search for approximate nearest neighbor in high dimensional spaces , 1998, STOC '98.

[23]  Peter Bro Miltersen The Bit Probe Complexity Measure Revisited , 1993, STACS.

[24]  Miklós Ajtai,et al.  Determinism versus non-determinism for linear time RAMs (extended abstract) , 1999, STOC '99.

[25]  Christos H. Papadimitriou,et al.  On the analysis of indexing schemes , 1997, PODS '97.

[26]  Jirí Matousek,et al.  Ray shooting and parametric search , 1992, STOC '92.

[27]  S. Meiser,et al.  Point Location in Arrangements of Hyperplanes , 1993, Inf. Comput..

[28]  Michael L. Fredman A Lower Bound on the Complexity of Orthogonal Range Queries , 1981, JACM.

[29]  Trevor Hastie,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1996 .

[30]  Dragutin Petkovic,et al.  Query by Image and Video Content: The QBIC System , 1995, Computer.

[31]  David G. Lowe,et al.  Shape indexing using approximate nearest-neighbour search in high-dimensional spaces , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32]  Gerald Salton,et al.  Automatic text processing , 1988 .

[33]  L. H. Harper Optimal numberings and isoperimetric problems on graphs , 1966 .

[34]  Miklós Ajtai,et al.  A non-linear time lower bound for Boolean branching programs , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).

[35]  Michael L. Fredman,et al.  Lower Bounds on the Complexity of Some Optimal Data Structures , 1981, SIAM J. Comput..

[36]  Jirí Matousek,et al.  Reporting Points in Halfspaces , 1992, Comput. Geom..

[37]  B. Xiao New bounds in cell probe model , 1992 .

[38]  Mark de Berg,et al.  Computational geometry: algorithms and applications , 1997 .

[39]  Arnold W. M. Smeulders,et al.  Image Databases and Multi-Media Search , 1998, Image Databases and Multi-Media Search.

[40]  Kenneth L. Clarkson,et al.  A Randomized Algorithm for Closest-Point Queries , 1988, SIAM J. Comput..

[41]  Eyal Kushilevitz,et al.  Communication Complexity , 1997, Adv. Comput..

[42]  Richard J. Lipton,et al.  Multidimensional Searching Problems , 1976, SIAM J. Comput..

[43]  Alex Pentland,et al.  Photobook: tools for content-based manipulation of image databases , 1994, Other Conferences.

[44]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification and Regression , 1995, NIPS.

[45]  Paul Beame,et al.  Time-Space Tradeoffs for Branching Programs , 2001 .

[46]  Peter Bro Miltersen On the Cell Probe Complexity of Polynomial Evaluation , 1995, Theor. Comput. Sci..

[47]  Ronald L. Rivest,et al.  Partial-Match Retrieval Algorithms , 1976, SIAM J. Comput..

[48]  Robert Sedgewick,et al.  Fast algorithms for sorting and searching strings , 1997, SODA '97.

[49]  Timothy M. Chan Approximate Nearest Neighbor Queries Revisited , 1998, Discret. Comput. Geom..

[50]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..

[51]  Michael Ben-Or,et al.  Lower bounds for algebraic computation trees , 1983, STOC.

[52]  Paul Beame,et al.  Time-space Tradeoffs for Branching Programs , 1998 .

[53]  Peter Bro Miltersen Lower bounds for union-split-find related problems on random access machines , 1994, STOC '94.

[54]  Marshall W. Bern,et al.  Approximate Closest-Point Queries in High Dimensions , 1993, Inf. Process. Lett..

[55]  Alex Pentland,et al.  Photobook: tools for content-based manipulation of image databases , 1994, Electronic Imaging.

[56]  Andrew Chi-Chih Yao,et al.  A general approach to d-dimensional geometric queries , 1985, STOC '85.

[57]  Kenneth L. Clarkson,et al.  An algorithm for approximate closest-point queries , 1994, SCG '94.

[58]  J. Matousek Reporting points in halfspaces , 1991, FOCS 1991.

[59]  Noga Alon,et al.  The Probabilistic Method , 2015, Fundamentals of Ramsey Theory.