Lp Row Sampling by Lewis Weights

We give a simple algorithm to efficiently sample the rows of a matrix while preserving the p-norms of its product with vectors. Given an n × d matrix A, we find, with high probability and in input-sparsity time, a matrix A' consisting of about d log d rescaled rows of A such that ‖Ax‖_1 is close to ‖A'x‖_1 for all vectors x. We also show similar results for all Lp that give nearly optimal sample bounds in input-sparsity time. Our results are based on sampling by "Lewis weights", which can be viewed as statistical leverage scores of a reweighted matrix. We also give an elementary proof of the guarantees of this sampling process for L1.
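
As a rough illustration of sampling by Lewis weights, the following is a minimal NumPy sketch, not the paper's input-sparsity-time algorithm. It assumes a dense matrix with full column rank and no zero rows, and it computes the weights with the fixed-point update w_i ← (a_iᵀ (Aᵀ W^(1−2/p) A)⁻¹ a_i)^(p/2), which is taken here as an assumption for p < 4 rather than something stated in the abstract. The function names `lewis_weights` and `sample_rows`, the iteration count, and the sample size are all hypothetical choices made for this example.

```python
import numpy as np

def lewis_weights(A, p=1.0, iters=50):
    """Approximate Lp Lewis weights by fixed-point iteration (assumed scheme, p < 4)."""
    n, d = A.shape
    w = np.full(n, d / n)  # uniform start; the weights sum to d at the fixed point
    for _ in range(iters):
        # Gram matrix of the reweighted matrix: A^T W^(1 - 2/p) A
        M = A.T @ (A * (w ** (1.0 - 2.0 / p))[:, None])
        # Quadratic forms a_i^T M^{-1} a_i for all rows at once
        q = np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)
        w = q ** (p / 2.0)
    return w

def sample_rows(A, w, num_samples, p=1.0, rng=None):
    """Draw rows with probability proportional to w and rescale them."""
    rng = np.random.default_rng() if rng is None else rng
    probs = w / w.sum()
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=probs)
    # Rescaling keeps the expected value of ||A'x||_p^p equal to ||Ax||_p^p
    scale = (1.0 / (num_samples * probs[idx])) ** (1.0 / p)
    return A[idx] * scale[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 5))
    w = lewis_weights(A, p=1.0)
    A_small = sample_rows(A, w, num_samples=2000, rng=rng)
    x = rng.standard_normal(5)
    print(np.abs(A @ x).sum(), np.abs(A_small @ x).sum())
```

The weights computed this way are the leverage scores of the reweighted matrix W^(1/2 − 1/p) A, which is the view the abstract describes. The sample size of 2000 is deliberately generous for the demonstration and is not meant to reflect the d log d bound.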
