Localized sketching for matrix multiplication and ridge regression

We consider sketched approximate matrix multiplication and ridge regression in the novel setting of localized sketching, where at any given point only part of the data matrix is available. This corresponds to a block diagonal structure on the sketching matrix. We show that, under mild conditions, block diagonal sketching matrices require only $O(\text{stable rank}/\epsilon^2)$ and $O(\text{stat. dim.}/\epsilon)$ total sample complexity for matrix multiplication and ridge regression, respectively. This matches the state-of-the-art bounds obtained using global sketching matrices. Because localized sketching allows different parts of the data matrix to be sketched independently, it is more amenable to computation in distributed and streaming settings and results in a smaller memory and computational footprint.
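
To make the block diagonal structure concrete, the following is a minimal sketch of localized approximate matrix multiplication, not the authors' implementation. It assumes Gaussian sketching blocks and a user-chosen number of samples per block; the function names `localized_sketch` and `approx_product` are hypothetical, and the "mild conditions" under which the stated bounds hold are not spelled out in the abstract and are not checked here. Each row block of the data is sketched on its own, and the product $A^T B$ is estimated by summing the per-block contributions, which is the same as applying one block diagonal sketching matrix to the full data.

```python
import numpy as np

def localized_sketch(blocks_A, blocks_B, m_per_block, rng):
    """Sketch each row block independently (hypothetical helper).

    Block j uses its own Gaussian matrix S_j of shape (m_j, n_j) with
    entries of variance 1/m_j, so that E[S_j^T S_j] = I. Stacking the
    S_j on a diagonal gives the block diagonal sketching matrix S.
    """
    sketched = []
    for A_j, B_j, m_j in zip(blocks_A, blocks_B, m_per_block):
        n_j = A_j.shape[0]
        S_j = rng.standard_normal((m_j, n_j)) / np.sqrt(m_j)
        # Only block j of the data is needed to form these two products.
        sketched.append((S_j @ A_j, S_j @ B_j))
    return sketched

def approx_product(sketched):
    """Estimate A^T B as sum_j (S_j A_j)^T (S_j B_j) = (S A)^T (S B)."""
    return sum(SA_j.T @ SB_j for SA_j, SB_j in sketched)

# Illustrative sizes only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n, d, p, num_blocks = 4000, 5, 3, 4
A = rng.standard_normal((n, d))
B = rng.standard_normal((n, p))

# Row blocks stand in for data held at different nodes or arriving over time.
blocks_A = np.array_split(A, num_blocks)
blocks_B = np.array_split(B, num_blocks)
m_per_block = [500] * num_blocks  # equal allocation is an assumption

sketched = localized_sketch(blocks_A, blocks_B, m_per_block, rng)
approx = approx_product(sketched)

exact = A.T @ B
rel_err = np.linalg.norm(approx - exact) / (np.linalg.norm(A) * np.linalg.norm(B))
print(f"error relative to ||A||_F ||B||_F: {rel_err:.4f}")
```

In this sketch no step ever touches more than one row block at a time, so the blocks could be processed on separate machines or as they arrive in a stream, with only the small sketched products being combined at the end.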
