ROIPCA: An Online PCA algorithm based on rank-one updates

Principal components analysis (PCA) is a fundamental algorithm in data analysis. Its online version is useful in many modern applications where the data are too large to fit in memory or where computation speed is important. In this paper, we propose ROIPCA, an online PCA algorithm based on rank-one updates. The complexity of ROIPCA is linear in both the dimension of the data and the number of components calculated. We demonstrate its advantages over existing state-of-the-art algorithms in terms of accuracy and running time.
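
To make the idea of a rank-one update concrete, the sketch below shows one generic way to refresh the top-k eigenpairs of a sample covariance matrix when a single new sample arrives. It is not the ROIPCA algorithm itself: it is a textbook-style projection approach that solves a small (k+1)-by-(k+1) eigenproblem, so its cost per sample is roughly O(dk^2) rather than linear in k. The function name `rank_one_pca_update`, its parameters, the tolerance `tol`, and the assumption that samples are already zero-mean are all illustrative choices, not taken from the paper.

```python
import numpy as np

def rank_one_pca_update(eigvals, eigvecs, x, n, tol=1e-12):
    """Illustrative rank-one update of a truncated eigendecomposition.

    Assumes (hypothetically) zero-mean samples and a covariance estimate
    C_n ~ eigvecs @ diag(eigvals) @ eigvecs.T built from n samples.
    Returns the top-k eigenpairs of
        C_{n+1} = n/(n+1) * C_n + 1/(n+1) * x x^T
    restricted to span([eigvecs, x]).
    """
    k = eigvals.shape[0]
    w = n / (n + 1.0)

    # Split x into its part inside the current subspace and the residual.
    coeffs = eigvecs.T @ x
    residual = x - eigvecs @ coeffs
    r_norm = np.linalg.norm(residual)

    if r_norm > tol:
        # Extend the basis by the normalized residual direction.
        basis = np.column_stack([eigvecs, residual / r_norm])
        z = np.append(coeffs, r_norm)
        diag = np.append(eigvals, 0.0)
    else:
        # x already lies in the current subspace; no new direction needed.
        basis, z, diag = eigvecs, coeffs, eigvals

    # Small symmetric matrix: diagonal plus rank-one term in the basis above.
    small = w * np.diag(diag) + (1.0 - w) * np.outer(z, z)
    vals, vecs = np.linalg.eigh(small)

    # Keep the k largest eigenpairs and rotate back to the original space.
    order = np.argsort(vals)[::-1][:k]
    return vals[order], basis @ vecs[:, order]
```

A caller would initialize `eigvals` and `eigvecs` from a batch eigendecomposition of the first few samples and then invoke the function once per incoming sample, incrementing `n` each time. When k is smaller than the data dimension, the result is an approximation, because the discarded part of the spectrum is treated as zero.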
