Matrix Entry-wise Sampling: Simple is Best [Extended Abstract]

Sparsifying matrices is a ubiquitous operation in large-scale machine learning, data mining, and signal processing. More formally, given a large matrix A, we aim to find another matrix B such that $\|A - B\| \le \varepsilon$, with B being significantly sparser than A. Using B as a surrogate for A is more efficient and often provides provably good approximations for many tasks. In this paper, we suggest an element-wise sampling scheme for producing B. We prove that it is superior to previously suggested schemes, using a relatively new matrix-valued version of the Bernstein inequality, which is known to be tight up to logarithmic factors. Moreover, the sampling scheme can be executed in the streaming model, where individual non-zero entries of the matrix are presented to the algorithm in an arbitrary order. We support our theoretical findings with experimental results.
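
To make the idea of element-wise sampling concrete, the following is a minimal sketch and not the scheme analyzed in the paper: the abstract does not state the sampling distribution, so the sketch assumes probabilities proportional to |A_ij| (an illustrative choice). The function name `sparsify`, the triple-list input format, and the parameter `s` (number of samples) are likewise assumptions made for this example. Each of the `s` draws adds A_ij / (s * p_ij) to position (i, j), which makes B an unbiased estimate of A with at most `s` non-zeros.

```python
import random
from collections import defaultdict


def sparsify(entries, s, seed=0):
    """Sample s entries of A (with replacement) and return a sparse B.

    entries: list of (i, j, value) triples for the non-zeros of A.
    Assumption for illustration: p_ij is proportional to |A_ij|.
    Returns a dict {(i, j): value} with at most s non-zero entries.
    """
    rng = random.Random(seed)
    # Drop explicit zeros so every sampled entry has p_ij > 0.
    nonzeros = [(i, j, v) for i, j, v in entries if v != 0]
    if not nonzeros or s <= 0:
        return {}
    weights = [abs(v) for _, _, v in nonzeros]
    total = sum(weights)

    B = defaultdict(float)
    for i, j, v in rng.choices(nonzeros, weights=weights, k=s):
        p = abs(v) / total          # probability of drawing entry (i, j)
        B[(i, j)] += v / (s * p)    # rescale so that E[B] = A
    return dict(B)


if __name__ == "__main__":
    A = [(0, 0, 4.0), (0, 1, -1.0), (1, 1, 3.0), (2, 0, 0.5)]
    print(sparsify(A, s=3))
```

With this particular choice of probabilities, every sampled entry receives increments of equal magnitude, sign(A_ij) * sum|A| / s, so B can be stored very compactly. Other distributions trade this property for different error guarantees.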
