The Fast Convergence of Incremental PCA

We consider a situation in which we see samples in $\mathbb{R}^d$ drawn i.i.d. from some distribution with mean zero and unknown covariance A. We wish to compute the top eigenvector of A in an incremental fashion - with an algorithm that maintains an estimate of the top eigenvector in O(d) space, and incrementally adjusts the estimate with each new data point that arrives. Two classical such schemes are due to Krasulina (1969) and Oja (1983). We give finite-sample convergence rates for both.

[1]  T. P. Krasulina The method of stochastic approximation for the determination of the least eigenvalue of a symmetrical matrix , 1969 .

[2]  Erkki Oja,et al.  Subspace methods of pattern recognition , 1983 .

[3]  E. Oja,et al.  On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix , 1985 .

[4]  R. Durrett Probability: Theory and Examples , 1993 .

[5]  Sam T. Roweis,et al.  EM Algorithms for PCA and SPCA , 1997, NIPS.

[6]  Juyang Weng,et al.  Candid Covariance-Free Incremental Principal Component Analysis , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression Database , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Gilles Blanchard,et al.  Statistical properties of Kernel Prinicipal Component Analysis , 2019 .

[9]  Gilles Blanchard,et al.  On the Convergence of Eigenspaces in Kernel Principal Component Analysis , 2005, NIPS.

[10]  Manfred K. Warmuth,et al.  Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension , 2006, NIPS.

[11]  Nathan Srebro,et al.  Stochastic optimization for PCA and PLS , 2012, 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[12]  Jing Lei,et al.  Minimax Rates of Estimation for Sparse PCA in High Dimensions , 2012, AISTATS.

[13]  Ohad Shamir,et al.  Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization , 2011, ICML.

[14]  T. Cai,et al.  Sparse PCA: Optimal rates and adaptive estimation , 2012, 1211.1309.

[15]  Ioannis Mitliagkas,et al.  Memory Limited, Streaming PCA , 2013, NIPS.

[16]  Nathan Srebro,et al.  Stochastic Optimization of PCA with Capped MSG , 2013, NIPS.

[17]  John Langford,et al.  A reliable effective terascale linear learning system , 2011, J. Mach. Learn. Res..