Dimensionality Reductions in ℓ2 that Preserve Volumes and Distance to Affine Spaces

Abstract. Let X be a set of n points in Euclidean space, and let 0 < ε < 1. A classical result of Johnson and Lindenstrauss [JL] states that there is a projection of X onto a subspace of dimension O(ε^{-2} log n) with distortion at most 1 + ε. We show a natural extension of this result to a stronger preservation of the geometry of finite spaces. By a k-fold increase in the number of dimensions compared with [JL], a good preservation of volumes and of distances between points and affine spaces is achieved. Specifically, we show how to embed a subset of size n of Euclidean space into an O(k ε^{-2} log n)-dimensional Euclidean space so that no set of size s ≤ k changes its volume by more than a factor of (1 + ε)^{s-1}. Moreover, distances of points from affine hulls of sets of at most k - 1 points do not change by more than a factor of 1 + ε. A consequence of the above with k = 3 is that angles can be preserved using asymptotically the same number of dimensions as in [JL]. Our method can be applied to many problems of a high-dimensional nature, such as Projective Clustering and Approximated Nearest Affine Neighbour Search. In particular, it yields a first approximation algorithm for the latter with polylogarithmic query time. We also show a structural application: for volume-respecting embeddings in the sense introduced by Feige [F], the host space need not generally have dimension greater than polylogarithmic in the size of the graph.
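As a rough numerical illustration of the kind of statement made in the abstract, the following sketch projects a few random points with a scaled Gaussian matrix and compares pairwise distances and triangle areas (the volumes of sets of size s = 3) before and after. The Gaussian matrix is an assumption, used here as a common stand-in for a random projection; the function names and the parameters n, d, m and seed are hypothetical, and the target dimension is picked by hand rather than derived from the bounds in the paper.

```python
import itertools
import math
import random


def random_projection(points, m, rng):
    """Map each d-dimensional point to m dimensions with a scaled Gaussian matrix."""
    d = len(points[0])
    # Entries are N(0, 1/m), so squared lengths are preserved in expectation.
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(d)]
              for _ in range(m)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix]
            for p in points]


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def triangle_area(p, q, r):
    """Area of the triangle pqr, computed from the Gram determinant."""
    u = [a - b for a, b in zip(q, p)]
    v = [a - b for a, b in zip(r, p)]
    uu = sum(a * a for a in u)
    vv = sum(a * a for a in v)
    uv = sum(a * b for a, b in zip(u, v))
    return 0.5 * math.sqrt(max(uu * vv - uv * uv, 0.0))


def distortion_report(n=6, d=400, m=200, seed=0):
    """Return (min, max) ratios of projected to original distances and areas."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    images = random_projection(points, m, rng)

    dist_ratios = [
        dist(images[i], images[j]) / dist(points[i], points[j])
        for i, j in itertools.combinations(range(n), 2)
    ]
    area_ratios = [
        triangle_area(images[i], images[j], images[k])
        / triangle_area(points[i], points[j], points[k])
        for i, j, k in itertools.combinations(range(n), 3)
    ]
    return (min(dist_ratios), max(dist_ratios)), (min(area_ratios), max(area_ratios))


if __name__ == "__main__":
    (dmin, dmax), (amin, amax) = distortion_report()
    print(f"distance ratios in [{dmin:.3f}, {dmax:.3f}]")
    print(f"triangle-area ratios in [{amin:.3f}, {amax:.3f}]")
```

Ratios close to 1 indicate low distortion. Increasing m in this sketch should tighten both ranges, which is in the spirit of the trade-off between dimension and distortion that the abstract describes.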