Streaming PCA with Many Missing Entries

We consider the streaming, memory-constrained principal component analysis (PCA) problem with missing entries, where the available storage is linear in the dimensionality of the problem and each vector has so many missing entries that matrix completion is not possible. SVD-based methods cannot be used because of the memory constraint, while imputation-based updates fail when faced with too many erasures. For this problem, we propose a method based on a block power update approach introduced in [14]. We show, on synthetic as well as benchmark data sets, that our approach outperforms existing approaches to streaming PCA by a significant margin in several interesting problem settings. We also consider the popular spiked covariance model with randomly missing entries and obtain the first known global convergence guarantees for this problem. We show that our method converges to the true “spike” using a number of samples that is linear in the dimension of the data. Moreover, our memory requirement is also linear in the ambient dimension. Thus, both memory and sample complexity scale optimally with dimension.
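As a rough illustration of the kind of block power update described above (a sketch under stated assumptions, not the exact update analyzed in the paper), the following Python snippet maintains an orthonormal d × k iterate, accumulates an unbiased estimate of the covariance-times-iterate product from zero-filled samples over each block, and re-orthonormalizes at the end of each block. The observation probability p (assumed known), the function name streaming_pca_missing, and the stream interface are illustrative assumptions.

```python
import numpy as np

def streaming_pca_missing(stream, d, k, p, block_size):
    """Block power update sketch for streaming PCA with missing entries.

    stream yields zero-filled sample vectors of length d, where unobserved
    coordinates are set to zero; each coordinate is assumed observed
    independently with known probability p.
    """
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    S = np.zeros((d, k))
    n_in_block = 0
    for x_obs in stream:
        # Unbiased estimate of (x x^T) Q from the zero-filled sample:
        # off-diagonal entries of x_obs x_obs^T have expectation p^2 * x_i x_j,
        # diagonal entries have expectation p * x_i^2, so rescale separately.
        xxTq = np.outer(x_obs, x_obs @ Q) / p**2
        diag_fix = (1.0 / p - 1.0 / p**2) * (x_obs**2)[:, None] * Q
        S += xxTq + diag_fix
        n_in_block += 1
        if n_in_block == block_size:
            # Orthonormalize the averaged block update and start a new block.
            Q, _ = np.linalg.qr(S / block_size)
            S = np.zeros((d, k))
            n_in_block = 0
    return Q
```

The per-sample rescaling (1/p² off the diagonal, 1/p on the diagonal) is the standard unbiasing of a zero-filled second-moment estimate when entries are erased independently; the working memory is O(dk), i.e., linear in the ambient dimension for a constant number of components.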

[1] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm (plus discussions on the paper), 1977.

[2] P. Wedin. On angles between subspaces of a finite dimensional inner product space, 1983.

[3] E. Oja, et al. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix, 1985.

[4] I. Johnstone. On the distribution of the largest eigenvalue in principal components analysis, 2001.

[5] Nathan Srebro, et al. Fast maximum margin matrix factorization for collaborative prediction, 2005, ICML.

[6] Andrea Montanari, et al. Matrix Completion from Noisy Entries, 2009, J. Mach. Learn. Res.

[7] Ruslan Salakhutdinov, et al. Practical Large-Scale Optimization for Max-norm Regularization, 2010, NIPS.

[8] Martin J. Wainwright, et al. Estimation of (near) low-rank matrices with noise and high-dimensional scaling, 2009, ICML.

[9] Robert D. Nowak, et al. Online identification and tracking of subspaces from highly incomplete information, 2010, 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[10] John C. S. Lui, et al. Online Robust Subspace Tracking from Partial Information, 2011, ArXiv.

[11] Emmanuel J. Candès, et al. Exact Matrix Completion via Convex Optimization, 2008, Found. Comput. Math.

[12] Joel A. Tropp, et al. User-Friendly Tail Bounds for Sums of Random Matrices, 2010, Found. Comput. Math.

[13] Nathan Srebro, et al. Stochastic optimization for PCA and PLS, 2012, 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[14] Shankar Vembu, et al. Chemical gas sensor drift compensation using classifier ensembles, 2012.

[15] Karim Lounici. High-dimensional covariance matrix estimation with missing observations, 2012, arXiv:1201.2577.

[16] Inderjit S. Dhillon, et al. Low rank modeling of signed networks, 2012, KDD.

[17] Sanjoy Dasgupta, et al. The Fast Convergence of Incremental PCA, 2013, NIPS.

[18] Ioannis Mitliagkas, et al. Memory Limited, Streaming PCA, 2013, NIPS.

[19] Stephen J. Wright, et al. Local Convergence of an Algorithm for Subspace Identification from Partial Data, 2013, Found. Comput. Math.