Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method

The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a most common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is ℓ = k + O(1). We demonstrate our analysis on a novel deterministic algorithm, StableCUR, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms.
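For concreteness, below is a minimal NumPy sketch of a generic CUR approximation in the spirit of the abstract: select ℓ = k + O(1) columns and rows, then compute the coupling matrix U = C⁺AR⁺. This is an illustration only, not the paper's StableCUR; uniform sampling stands in for the strong singular-value-bounded column selection scheme the authors use, and the function name and parameters are hypothetical.

```python
import numpy as np

def cur_approximation(A, k, oversample=10, seed=None):
    """Generic CUR sketch (hypothetical helper, not the paper's StableCUR).

    Samples l = k + oversample columns and rows uniformly at random,
    then forms the coupling matrix U = pinv(C) @ A @ pinv(R), the best
    choice of U for the selected C and R in Frobenius norm.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = k + oversample  # constant oversampling: l = k + O(1)
    cols = rng.choice(n, size=min(l, n), replace=False)
    rows = rng.choice(m, size=min(l, m), replace=False)
    C = A[:, cols]      # selected columns of A
    R = A[rows, :]      # selected rows of A
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

# Usage example on a synthetic matrix with rapidly decaying spectrum,
# the regime where the paper's bounds are tightest.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((200, 40))
    A = B @ np.diag(0.5 ** np.arange(40)) @ rng.standard_normal((40, 150))
    C, U, R = cur_approximation(A, k=10, seed=1)
    err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
    print(f"relative Frobenius error: {err:.2e}")
```

Note that forming U via two pseudoinverses, as above, is exactly the kind of step whose potential instability the paper's StableCUR algorithm is designed to avoid.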
