Residual Based Sampling for Online Low Rank Approximation

We propose online algorithms for Column Subset Selection (CSS) and Principal Component Analysis (PCA), two methods that are widely employed for data analysis, summarization, and visualization. Given a data matrix A that is revealed one column at a time, the online CSS problem asks us to maintain a small set of columns, S, that best approximates the space spanned by the columns of A. As each column arrives, the algorithm must irrevocably decide whether to add it to S or to ignore it. In the online PCA problem, the goal is to output a projection of each column onto a low-dimensional subspace; in other words, the algorithm must provide an embedding for each column as it arrives, which cannot be revised as new columns arrive. While both of these problems have been studied in the online setting, only additive approximations were known prior to our work. The core of our approach is an adaptive sampling technique that yields a practical and efficient algorithm for both problems. We prove that by sampling columns according to their "residual norm" (i.e., the portion of their norm orthogonal to the directions sampled so far), we obtain a significantly better trade-off between the number of columns sampled and the desired approximation error. We further show how to combine our algorithm "in series" with prior algorithms. In particular, using the results of Boutsidis et al. and Frieze et al., which have additive guarantees, we show how to improve the error bounds of our algorithm.
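To make the sampling rule concrete, below is a minimal Python sketch of one natural reading of residual-norm sampling for online CSS. The function name `online_residual_css`, the threshold parameter `eps`, and the acceptance rule (keep a column with probability proportional to its squared residual norm, capped at 1) are illustrative assumptions, not the paper's exact algorithm or probabilities.

```python
# A minimal sketch (not the paper's exact algorithm) of residual-norm
# sampling for online column subset selection. The threshold `eps` and
# the acceptance rule p = min(1, ||residual||^2 / eps) are assumptions
# made for illustration only.
import numpy as np

def online_residual_css(columns, eps, rng=None):
    """Stream over the columns of A, keeping each one with probability
    proportional to its squared residual norm: the squared norm of its
    component orthogonal to the span of the columns kept so far."""
    rng = np.random.default_rng() if rng is None else rng
    basis = []   # orthonormal basis for the span of the kept columns S
    kept = []    # indices of the sampled columns
    for i, a in enumerate(columns):
        r = np.asarray(a, dtype=float).copy()
        for q in basis:              # Gram-Schmidt: project out span(S)
            r -= (q @ r) * q
        res2 = float(r @ r)          # squared residual norm of column i
        if res2 > 0 and rng.random() < min(1.0, res2 / eps):
            basis.append(r / np.sqrt(res2))  # irrevocably add to S
            kept.append(i)
    return kept

# Example usage on a random low-rank-plus-noise matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10)) @ rng.standard_normal((10, 500))
A += 0.01 * rng.standard_normal(A.shape)
S = online_residual_css(A.T, eps=100.0)  # rows of A.T are columns of A
```

The intuition this sketch captures: columns already well explained by S have small residual norm and are rarely kept, while columns that open up a new direction have large residual norm and are kept with high probability. A larger `eps` keeps fewer columns at the cost of higher approximation error; the paper's analysis quantifies this trade-off.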

[1] Christos Boutsidis, et al. Online Principal Components Analysis, 2015, SODA.

[2] Jakub W. Pachocki, et al. Online Row Sampling, 2016, APPROX-RANDOM.

[3] Michel Goemans, et al. Chernoff bounds, and some applications, 2014.

[4] Venkatesan Guruswami, et al. Optimal column-based low-rank matrix reconstruction, 2011, SODA.

[5] Petros Drineas, et al. Pass efficient algorithms for approximating large matrices, 2003, SODA.

[6] Santosh S. Vempala, et al. Matrix approximation and projective clustering via volume sampling, 2006, SODA.

[7] Manfred K. Warmuth, et al. Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension, 2006, NIPS.

[8] Sanjoy Dasgupta, et al. The Fast Convergence of Incremental PCA, 2013, NIPS.

[9] Jeff M. Phillips, et al. Relative Errors for Deterministic Low-Rank Matrix Approximations, 2013, SODA.

[10] Christos Boutsidis, et al. Near Optimal Column-Based Matrix Reconstruction, 2011, FOCS.

[11] Per-Gunnar Martinsson, et al. Randomized algorithms for the low-rank approximation of matrices, 2007, Proceedings of the National Academy of Sciences.

[12] Edo Liberty, et al. Simple and deterministic matrix sketching, 2012, KDD.

[13] Amit R. Indap, et al. Genes mirror geography within Europe, 2008, Nature.

[14] Aarti Singh, et al. Provably Correct Algorithms for Matrix Column Subset Selection with Selectively Sampled Data, 2015, J. Mach. Learn. Res.

[15] Amit Kumar, et al. Clustering with Spectral Norm and the k-Means Algorithm, 2010, FOCS.

[16] Ioannis Mitliagkas, et al. Memory Limited, Streaming PCA, 2013, NIPS.

[17] Michael B. Cohen, et al. Input Sparsity Time Low-rank Approximation via Ridge Leverage Score Sampling, 2015, SODA.

[18] Nathan Srebro, et al. Stochastic Optimization of PCA with Capped MSG, 2013, NIPS.

[19] Alex Pentland, et al. Face recognition using eigenfaces, 1991, CVPR.

[20] Wojciech Kotlowski, et al. Bandit Principal Component Analysis, 2019, COLT.

[21] David P. Woodruff, et al. Numerical linear algebra in the streaming model, 2009, STOC.

[22] Joel A. Tropp, et al. Column subset selection, matrix factorization, and eigenvalue optimization, 2008, SODA.

[23] Adam Meyerson, et al. Online facility location, 2001, FOCS.

[24] Alan M. Frieze, et al. Fast Monte-Carlo algorithms for finding low-rank approximations, 2004, JACM.

[25] Jiazhong Nie, et al. Online PCA with Optimal Regret, 2016, J. Mach. Learn. Res.

[26] Petros Drineas, et al. Column Selection via Adaptive Sampling, 2015, NIPS.

[27] Sergei Vassilvitskii, et al. k-means++: the advantages of careful seeding, 2007, SODA.

[28] Christos Boutsidis, et al. Optimal CUR matrix decompositions, 2014, STOC.

[29] Christos Boutsidis, et al. An improved approximation algorithm for the column subset selection problem, 2008, SODA.

[30] Edo Liberty, et al. Online PCA with Spectral Bounds, 2015, COLT.

[31] Santosh S. Vempala, et al. Adaptive Sampling and Fast Low-Rank Matrix Approximation, 2006, APPROX-RANDOM.

[32] Luis Rademacher, et al. Efficient Volume Sampling for Row/Column Subset Selection, 2010, FOCS.

[33] Alan M. Frieze, et al. Fast Monte-Carlo Algorithms for Finding Low-Rank Approximations, 1998, FOCS.

[34] Maxim Sviridenko, et al. An Algorithm for Online K-Means Clustering, 2014, ALENEX.

[35] Tamás Sarlós, et al. Improved Approximation Algorithms for Large Matrices via Random Projections, 2006, FOCS.