Paper information - Relative Errors for Deterministic Low-Rank Matrix Approximations

Relative Errors for Deterministic Low-Rank Matrix Approximations

We consider processing an n x d matrix A in a stream with row-wise updates, using a recent algorithm called Frequent Directions (Liberty, KDD 2013). This algorithm deterministically maintains an l x d matrix Q, processing each row in O(d l^2) time; the processing time can be reduced to O(d l) with a slight modification to the algorithm and a constant-factor increase in space. We show that if one sets l = k + k/eps and returns Q_k, a k x d matrix that is the best rank-k approximation to Q, then the following properties hold, where A_k denotes the best rank-k approximation to A. First, ||A - A_k||_F^2 <= ||A||_F^2 - ||Q_k||_F^2 <= (1+eps) ||A - A_k||_F^2. Second, if pi_{Q_k}(A) denotes the projection of A onto the row space of Q_k, then ||A - pi_{Q_k}(A)||_F^2 <= (1+eps) ||A - A_k||_F^2. We also show that Frequent Directions cannot be adapted in an obvious way to a sparse version that retains l original rows of the matrix, as opposed to linear combinations or sketches of the rows.
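As a rough illustration of the abstract, the following is a minimal NumPy sketch of the simpler O(d l^2)-per-row variant, not the authors' code. It assumes that each incoming row is written into the last (zero) row of Q, and that the smallest squared singular value is then subtracted from all squared singular values so that the last row becomes zero again. The function names (frequent_directions, best_rank_k, error_terms), the rounding l = k + ceil(k/eps), and the restriction l <= d are assumptions made for this example.

```python
import math

import numpy as np


def frequent_directions(A, ell):
    """Stream the rows of A into an ell x d sketch Q (illustrative helper)."""
    n, d = A.shape
    if not 1 <= ell <= d:
        raise ValueError("this sketch assumes 1 <= ell <= d")
    Q = np.zeros((ell, d))
    for row in A:
        Q[-1] = row  # the last row is zero before every insertion
        _, s, Vt = np.linalg.svd(Q, full_matrices=False)
        delta = s[-1] ** 2  # smallest squared singular value
        # Shrink every squared singular value by delta; the last one becomes 0.
        s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        Q = s_shrunk[:, None] * Vt
    return Q


def best_rank_k(M, k):
    """Return the k x d matrix S_k V_k^T built from the top-k part of M's SVD."""
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    return s[:k, None] * Vt[:k]


def error_terms(A, k, eps):
    """Compute the three quantities that appear in the stated bounds."""
    ell = k + math.ceil(k / eps)  # assumed rounding of l = k + k/eps
    Q_k = best_rank_k(frequent_directions(A, ell), k)
    s_A = np.linalg.svd(A, compute_uv=False)
    tail = float(np.sum(s_A[k:] ** 2))  # ||A - A_k||_F^2
    gap = float(np.sum(A ** 2) - np.sum(Q_k ** 2))  # ||A||_F^2 - ||Q_k||_F^2
    # Orthonormal basis of the row space of Q_k, used to project A.
    _, _, Vt = np.linalg.svd(Q_k, full_matrices=False)
    V = Vt[:k].T
    proj_err = float(np.sum((A - A @ V @ V.T) ** 2))  # ||A - pi_{Q_k}(A)||_F^2
    return tail, gap, proj_err


rng = np.random.default_rng(0)
A = rng.standard_normal((200, 30))
tail, gap, proj_err = error_terms(A, k=3, eps=0.5)
# Both comparisons correspond to the bounds in the abstract with eps = 0.5.
print(tail <= gap <= 1.5 * tail, proj_err <= 1.5 * tail)
```

The random test matrix is only a sanity check; the bounds themselves are deterministic and do not depend on how A is generated.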

Jeff M. Phillips | Mina Ghashami

[1] Matthew Brand, et al. Incremental Singular Value Decomposition of Uncertain Data with Missing Values, 2002, ECCV.

[2] Graham Cormode, et al. The continuous distributed monitoring model, 2013, SIGMOD Rec.

[3] Richard M. Karp, et al. A simple algorithm for finding frequent elements in streams and bags, 2003, TODS.

[4] Petros Drineas, et al. CUR matrix decompositions for improved data analysis, 2009, Proceedings of the National Academy of Sciences.

[5] David P. Woodruff, et al. Numerical linear algebra in the streaming model, 2009, STOC '09.

[6] Alan M. Frieze, et al. Fast Monte-Carlo algorithms for finding low-rank approximations, 2004, JACM.

[7] David P. Woodruff, et al. Low rank approximation and regression in input sparsity time, 2012, STOC '13.

[8] Dan Feldman, et al. Turning big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering, 2013, SODA.

[9] Erik D. Demaine, et al. Identifying frequent items in sliding windows over on-line packet streams, 2003, IMC '03.

[10] Piotr Indyk, et al. Space-optimal heavy hitters with strong error bounds, 2010, TODS.

[11] S. Muthukrishnan, et al. Data streams: algorithms and applications, 2005, SODA '03.

[12] Mark Rudelson, et al. Sampling from large matrices: An approach through geometric functional analysis, 2005, JACM.

[13] Tamás Sarlós, et al. Improved Approximation Algorithms for Large Matrices via Random Projections, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[14] Ming-Hsuan Yang, et al. Incremental Learning for Robust Visual Tracking, 2008, International Journal of Computer Vision.

[15] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.

[16] Michael Lindenbaum, et al. Sequential Karhunen-Loeve basis extraction and its application to images, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[17] Dimitris Achlioptas, et al. Fast computation of low rank matrix approximations, 2001, STOC '01.

[18] Santosh S. Vempala, et al. Adaptive Sampling and Fast Low-Rank Matrix Approximation, 2006, APPROX-RANDOM.

[19] Petros Drineas, et al. Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition, 2004.

[20] Petros Drineas, et al. Pass efficient algorithms for approximating large matrices, 2003, SODA '03.

[21] Petros Drineas, et al. Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, 2004.

[22] Graham Cormode, et al. Mergeable summaries, 2012, PODS '12.

[23] Divyakant Agrawal, et al. An integrated efficient solution for computing frequent and top-k elements in data streams, 2006, TODS.

[24] Edo Liberty, et al. Simple and deterministic matrix sketching, 2012, KDD.

[25] Ralph R. Martin, et al. Incremental Eigenanalysis for Classification, 1998, BMVC.

[26] Erik D. Demaine, et al. Frequency Estimation of Internet Packet Streams with Limited Space, 2002, ESA.

[27] Petros Drineas, et al. Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication, 2006, SIAM J. Comput.

[28] Huy L. Nguyen, et al. OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[29] Jayadev Misra, et al. Finding Repeated Elements, 1982, Sci. Comput. Program.

[30] S. Muthukrishnan, et al. Relative-Error CUR Matrix Decompositions, 2007, SIAM J. Matrix Anal. Appl.

[31] Christos Boutsidis, et al. Near Optimal Column-Based Matrix Reconstruction, 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.