Better algorithms for high-dimensional proximity problems via asymmetric embeddings

In this paper we give several results based on randomized embeddings of <i>l</i><inf>2</inf> into <i>l</i><inf>∞</inf> (or "<i>l</i><inf>∞</inf>-like") spaces. Our first result is a (1 + ε)-distortion <i>asymmetric</i> embedding of <i>n</i> points in <i>l</i><inf>2</inf> into <i>l</i><inf>∞</inf> with polylog(<i>n</i>) dimension, for any ε > 0. This gives the first known <i>O</i>(1)-approximate nearest neighbor algorithm with fast query time and almost polynomial space for a product of Euclidean norms, a common generalization of both <i>l</i><inf>2</inf> and <i>l</i><inf>∞</inf> norms. Our embedding also clarifies the relative complexity of approximate nearest neighbor in <i>l</i><inf>2</inf> and <i>l</i><inf>∞</inf> spaces. Our second result is a (1 + ε)-approximate algorithm for the diameter of <i>n</i> points in <i>l<sup>d</sup><inf>2</inf></i>, running in time <i>Õ</i>(<i>dn</i><sup>1+1/(1+ε)<sup>2</sup></sup>); the algorithm is fully dynamic. This improves several previous algorithms for this problem (see Table 1 for more information).
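
As an illustration of why a product of Euclidean norms generalizes both norms, the following minimal sketch assumes the usual max-product definition: a point is a list of blocks, and its norm is the maximum of the blocks' Euclidean norms. The function names and the block representation are hypothetical, chosen only for this example, and the definition itself is an assumption not spelled out in the abstract.

```python
import math


def max_product_norm(blocks):
    """Assumed definition: max over blocks of each block's Euclidean norm."""
    return max(math.sqrt(sum(c * c for c in block)) for block in blocks)


def max_product_distance(x, y):
    """Distance induced by the norm above; x and y must share a block layout."""
    diff = [[a - b for a, b in zip(bx, by)] for bx, by in zip(x, y)]
    return max_product_norm(diff)


# A single block reduces to the plain Euclidean (l_2) norm.
single_block = max_product_norm([[3.0, 4.0]])

# One coordinate per block reduces to the l_infinity norm.
one_coord_blocks = max_product_norm([[3.0], [-4.0]])
```

Under this assumption, a single block recovers the <i>l</i><inf>2</inf> norm and one-coordinate blocks recover the <i>l</i><inf>∞</inf> norm.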
