Sparser Johnson-Lindenstrauss Transforms

We give two different, simple constructions for dimensionality reduction in <i>ℓ</i><sub>2</sub> via sparse linear mappings: to achieve distortion 1 + <i>ϵ</i> with high probability, only an <i>O</i>(<i>ϵ</i>)-fraction of the entries in each column of our embedding matrices need to be non-zero, and the matrices still have the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of the parameters, improving upon the previous work of Achlioptas [2003] and Dasgupta et al. [2010]. Such distributions can be used to speed up applications where <i>ℓ</i><sub>2</sub> dimensionality reduction is used.
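To make the notion of column sparsity concrete, here is a minimal Python sketch of one common way to build a column-sparse random sign matrix. It is an illustration under stated assumptions, not necessarily either of the two constructions referred to above. The names `sparse_jl_columns` and `embed` and the parameters `d` (input dimension), `m` (number of rows), and `s` (non-zeros per column) are hypothetical. The sketch uses fully random choices and does not attempt the parameter settings or limited-independence hashing that an analysis would require. In the regime described above, `s` would be roughly an <i>ϵ</i>-fraction of `m`.

```python
import math
import random


def sparse_jl_columns(d, m, s, seed=0):
    """Build a d-column, m-row sparse sign matrix, stored column by column.

    Assumption (illustrative only): the m rows are split into s equal blocks
    and each column gets exactly one entry +-1/sqrt(s) per block, placed at a
    uniformly random row of that block.
    """
    if m % s != 0:
        raise ValueError("m must be a multiple of s")
    rng = random.Random(seed)
    block = m // s
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(d):
        col = []
        for b in range(s):
            row = b * block + rng.randrange(block)  # one row inside block b
            sign = rng.choice((-1.0, 1.0))          # independent random sign
            col.append((row, sign * scale))
        columns.append(col)
    return columns


def embed(columns, m, x):
    """Compute y = S x, touching only the s non-zeros of each used column."""
    y = [0.0] * m
    for j, xj in enumerate(x):
        if xj != 0.0:
            for row, val in columns[j]:
                y[row] += val * xj
    return y
```

With this layout, embedding a vector costs time proportional to `s` times the number of non-zero coordinates of the input, which is where a sparse matrix saves work over a dense one.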

[1]  E. Wigner.  Characteristic Vectors of Bordered Matrices with Infinite Dimensions I, 1955.

[2]  F. T. Wright,et al.  A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables, 1971.

[3]  Larry Carter,et al.  Universal Classes of Hash Functions , 1979, J. Comput. Syst. Sci..

[4]  János Komlós,et al.  The eigenvalues of random symmetric matrices , 1981, Comb..

[5]  W. B. Johnson,et al.  Extensions of Lipschitz mappings into Hilbert space, 1984.

[6]  Peter Frankl,et al.  The Johnson-Lindenstrauss lemma and the sphericity of some graphs , 1987, J. Comb. Theory B.

[7]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[8]  Russ Bubley,et al.  Randomized algorithms , 1995, CSUR.

[9]  Piotr Indyk,et al.  Algorithmic applications of low-distortion geometric embeddings , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[10]  Moses Charikar,et al.  Finding frequent items in data streams , 2002, Theor. Comput. Sci..

[11]  R. Gregory Taylor,et al.  Modern computer algebra , 2002, SIGA.

[12]  Noga Alon,et al.  Problems and results in extremal combinatorics--I , 2003, Discret. Math..

[13]  Dimitris Achlioptas,et al.  Database-friendly random projections: Johnson-Lindenstrauss with binary coins , 2003, J. Comput. Syst. Sci..

[14]  Sanjoy Dasgupta,et al.  An elementary proof of a theorem of Johnson and Lindenstrauss , 2003, Random Struct. Algorithms.

[15]  Mikkel Thorup,et al.  Tabulation based 4-universal hashing with applications to second moment estimation , 2004, SODA '04.

[16]  Santosh S. Vempala,et al.  The Random Projection Method , 2005, DIMACS Series in Discrete Mathematics and Theoretical Computer Science.

[17]  Santosh S. Vempala,et al.  An algorithmic theory of learning: Robust concepts and random projection , 1999, Machine Learning.

[18]  Tamás Sarlós,et al.  Improved Approximation Algorithms for Large Matrices via Random Projections , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[19]  Dan Suciu,et al.  Journal of the ACM, 2006.

[20]  Bernard Chazelle,et al.  Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform , 2006, STOC '06.

[21]  Jirí Matousek,et al.  On variants of the Johnson–Lindenstrauss lemma , 2008, Random Struct. Algorithms.

[22]  Nir Ailon,et al.  Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes , 2008, SODA '08.

[23]  Moni Naor,et al.  Derandomized Constructions of k-Wise (Almost) Independent Permutations , 2005, Algorithmica.

[24]  Kilian Q. Weinberger,et al.  Feature hashing for large scale multitask learning , 2009, ICML '09.

[25]  Bernard Chazelle,et al.  The Fast Johnson-Lindenstrauss Transform and Approximate Nearest Neighbors , 2009, SIAM J. Comput..

[26]  David P. Woodruff,et al.  Numerical linear algebra in the streaming model , 2009, STOC '09.

[27]  Anirban Dasgupta,et al.  A sparse Johnson-Lindenstrauss transform , 2010, STOC '10.

[28]  Rafail Ostrovsky,et al.  Rademacher Chaos, Random Eulerian Graphs and The Sparse Johnson-Lindenstrauss Transform , 2010, ArXiv.

[29]  Daniel M. Kane,et al.  A Derandomized Sparse Johnson-Lindenstrauss Transform , 2010, Electron. Colloquium Comput. Complex..

[30]  Jan Vybíral.  A variant of the Johnson-Lindenstrauss lemma for circulant matrices , 2010, arXiv:1002.2847.

[31]  Raghu Meka.  Almost Optimal Explicit Johnson-Lindenstrauss Transformations , 2010, ArXiv.

[32]  Daniel M. Kane,et al.  A Sparser Johnson-Lindenstrauss Transform , 2010, ArXiv.

[33]  Rachel Ward,et al.  New and Improved Johnson-Lindenstrauss Embeddings via the Restricted Isometry Property , 2010, SIAM J. Math. Anal..

[34]  Aicke Hinrichs,et al.  Johnson-Lindenstrauss lemma for circulant matrices , 2010, Random Struct. Algorithms.

[35]  Daniel M. Kane,et al.  Almost Optimal Explicit Johnson-Lindenstrauss Families , 2011, APPROX-RANDOM.

[36]  David P. Woodruff,et al.  Fast moment estimation in data streams in optimal space , 2010, STOC '11.

[37]  Nir Ailon,et al.  An almost optimal unrestricted fast Johnson-Lindenstrauss transform , 2010, SODA '11.

[38]  David P. Woodruff,et al.  Low rank approximation and regression in input sparsity time , 2012, STOC '13.

[39]  Piotr Indyk,et al.  Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality , 2012, Theory Comput..

[40]  Mikkel Thorup,et al.  Tabulation-Based 5-Independent Hashing with Applications to Linear Probing and Second Moment Estimation , 2012, SIAM J. Comput..

[41]  David P. Woodruff,et al.  Low rank approximation and regression in input sparsity time , 2013, STOC '13.

[42]  Huy L. Nguyen,et al.  OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings , 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[43]  David P. Woodruff,et al.  Optimal Bounds for Johnson-Lindenstrauss Transforms and Streaming Problems with Subconstant Error , 2011, TALG.

[44]  Huy L. Nguyen,et al.  Sparsity lower bounds for dimensionality reducing maps , 2012, STOC '13.

[45]  Michael W. Mahoney,et al.  Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression , 2012, STOC '13.