Sketched SVD: Recovering Spectral Features from Compressive Measurements

We consider a streaming data model in which n sensors each observe an individual stream of data, presented in the turnstile model. Our goal is to analyze the singular value decomposition (SVD) of the data matrix X defined implicitly by the stream of updates, where column i of X is given by the stream of updates seen at sensor i. Our approach is to sketch each column of X, forming a "sketch matrix" Y, and then to compute the SVD of Y. We show that the singular values and right singular vectors of Y are close to those of X, with small relative error. Specifically, if X is of size N x n, then taking m linear measurements of each column of X yields a smaller matrix Y of size m x n. If m = O(k \epsilon^{-2} (log(1/\epsilon) + log(1/\delta))), where k denotes the rank of X, then with probability at least 1-\delta, the singular values \sigma'_j of Y satisfy the relative error bound (1-\epsilon)^{1/2} <= \sigma'_j/\sigma_j <= (1+\epsilon)^{1/2}, where \sigma_j are the singular values of the original matrix X. Furthermore, the right singular vectors v'_j of Y satisfy ||v_j - v'_j||_2 <= min(\sqrt{2}, (\epsilon\sqrt{1+\epsilon})/(\sqrt{1-\epsilon}) max_{i\neq j} (\sqrt{2}\sigma_i\sigma_j)/(min_{c\in[-1,1]} |\sigma^2_i - \sigma^2_j(1+c\epsilon)|)), where v_j are the right singular vectors of X. We believe these bounds are also of independent interest in non-streaming and non-distributed data collection settings. We apply this result to obtain a streaming graph algorithm that approximates the eigenvalues and eigenvectors of the graph Laplacian when the Laplacian has low rank, that is, when the graph has many connected components.
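The sketching step described in the abstract can be illustrated with a small numerical experiment. The following is a minimal sketch, not the authors' implementation: it assumes a dense Gaussian measurement matrix scaled by 1/sqrt(m) (the abstract does not name a sketching ensemble), and the sizes N, n, k, m and the helper name `sketch_columns` are hypothetical choices made only for illustration. It builds a rank-k matrix X, applies the same linear map to every column to obtain Y, and compares the leading singular values and right singular vectors of the two matrices.

```python
import numpy as np

def sketch_columns(X, m, rng):
    """Apply one m x N linear map to every column of X, giving Y = Phi @ X.

    Assumption: Phi has i.i.d. Gaussian entries scaled by 1/sqrt(m).
    """
    N = X.shape[0]
    Phi = rng.standard_normal((m, N)) / np.sqrt(m)
    return Phi @ X

rng = np.random.default_rng(0)
N, n, k, m = 2000, 50, 5, 400  # illustrative sizes, not taken from the paper

# Rank-k data matrix X of size N x n.
X = rng.standard_normal((N, k)) @ rng.standard_normal((k, n))
Y = sketch_columns(X, m, rng)  # sketch matrix of size m x n

# Leading k singular values and right singular vectors of X and Y.
_, s_X, Vt_X = np.linalg.svd(X, full_matrices=False)
_, s_Y, Vt_Y = np.linalg.svd(Y, full_matrices=False)

# Ratios sigma'_j / sigma_j for the k nonzero singular values.
ratios = s_Y[:k] / s_X[:k]

# Singular vectors are defined only up to sign, so align signs first.
dots = np.sum(Vt_X[:k] * Vt_Y[:k], axis=1)
signs = np.where(dots < 0, -1.0, 1.0)
dist = np.linalg.norm(Vt_X[:k] - signs[:, None] * Vt_Y[:k], axis=1)

print("singular value ratios:", ratios)
print("right singular vector distances:", dist)
```

Because the sketch is linear, a turnstile update that adds a value d to entry r of column i changes column i of Y by d times column r of Phi, so Y can be maintained without ever storing X. After sign alignment the vector distances can never exceed \sqrt{2}, which matches the first term inside the min of the stated bound; how close they come to zero depends on the gaps between the singular values.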
