Streaming PCA and Subspace Tracking: The Missing Data Case

For many modern applications in science and engineering, data are collected in a streaming fashion and carry time-varying information, and practitioners must process them in a timely manner for decision making, using only a limited amount of memory and computation. This is often coupled with the missing data problem, in which only a small fraction of data attributes are observed. These complications impose significant and unconventional constraints on the problem of streaming principal component analysis (PCA) and subspace tracking, which is an essential building block for many inference tasks in signal processing and machine learning. This survey reviews a variety of classical and recent algorithms for solving this problem with low computational and memory complexities, particularly those applicable in the big data regime with missing data. We illustrate that streaming PCA and subspace tracking algorithms can be understood through both algebraic and geometric perspectives, and that they need to be adjusted carefully to handle missing data. Both asymptotic and nonasymptotic convergence guarantees are reviewed. Finally, we benchmark the performance of several competitive algorithms in the presence of missing data, for both well-conditioned and ill-conditioned systems.
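To make the setting concrete, the following is a minimal sketch of subspace tracking from incomplete observations, loosely in the spirit of Grassmannian first-order methods such as GROUSE: each incoming vector is only partially observed, the best coefficients on the current basis are fit from the observed entries alone, and the basis takes a rank-one gradient step along the residual. The function name, the fixed step size, and the simplified additive update followed by re-orthonormalization are illustrative assumptions, not a faithful implementation of any surveyed algorithm.

```python
import numpy as np

def subspace_track_missing(stream, d, k, eta=0.1, seed=0):
    """Track a k-dimensional subspace of R^d from a stream of partially
    observed vectors. Each stream element is (x_obs, idx): the observed
    values and the indices of the observed coordinates.

    This is a simplified first-order sketch (hypothetical names and
    parameters), not a faithful implementation of any specific method.
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal initialization of the d x k basis.
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for x_obs, idx in stream:
        x_obs = np.asarray(x_obs, dtype=float)
        # Fit coefficients using only the observed rows of the basis.
        w, *_ = np.linalg.lstsq(U[idx], x_obs, rcond=None)
        # Residual on observed coordinates; zero where unobserved.
        r = np.zeros(d)
        r[idx] = x_obs - U[idx] @ w
        # Rank-one gradient step, then re-orthonormalize the basis.
        U = U + eta * np.outer(r, w)
        U, _ = np.linalg.qr(U)
    return U
```

Fitting `w` by least squares on the observed rows is what makes the update well-defined under missing data (it requires more observed entries per vector than the rank `k`); exact Grassmannian methods instead take a geodesic step on the manifold, which this additive step plus QR only approximates.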
