LWI-SVD: low-rank, windowed, incremental singular value decompositions on time-evolving data sets

Singular Value Decomposition (SVD) is computationally costly and therefore a naive implementation does not scale to the needs of scenarios where data evolves continuously. While there are various on-line analysis and incremental decomposition techniques, these may not accurately represent the data or may be slow for the needs of many applications. To address these challenges, in this paper, we propose a Low-rank, Windowed, Incremental SVD (LWI-SVD) algorithm, which (a) leverages efficient and accurate low-rank approximations to speed up incremental SVD updates and (b) uses a window-based approach to aggregate multiple incoming updates (insertions or deletions of rows and columns) and, thus, reduces on- line processing costs. We also present an LWI-SVD with restarts (LWI2-SVD) algorithm which leverages a novel highly efficient partial reconstruction based change detection scheme to support timely refreshing of the decomposition with significant changes in the data and prevent accumulation of errors over time. Experiment results, including comparisons to other state of the art techniques on different data sets and under different parameter settings, confirm that LWI-SVD and LWI2-SVD are both efficient and accurate in maintaining decompositions.

[1]  Michael W. Berry,et al.  Algorithm 844: Computing sparse reduced-rank approximations to sparse matrices , 2005, TOMS.

[2]  G. W. Stewart,et al.  Four algorithms for the the efficient computation of truncated pivoted QR approximations to a sparse matrix , 1999, Numerische Mathematik.

[3]  Jeffrey Scott Vitter,et al.  Random sampling with a reservoir , 1985, TOMS.

[4]  Tamara G. Kolda,et al.  Tensor Decompositions and Applications , 2009, SIAM Rev..

[5]  Michael Lindenbaum,et al.  Sequential Karhunen-Loeve basis extraction and its application to images , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[6]  Jimeng Sun,et al.  Online latent variable detection in sensor networks , 2005, 21st International Conference on Data Engineering (ICDE'05).

[7]  Hiroshi Motoda,et al.  Book Review: Computational Methods of Feature Selection , 2007, The IEEE intelligent informatics bulletin.

[8]  Gavin W. O''Brien,et al.  Information Management Tools for Updating an SVD-Encoded Indexing Scheme , 1994 .

[9]  Hongyuan Zha,et al.  On Updating Problems in Latent Semantic Indexing , 1997, SIAM J. Sci. Comput..

[10]  Stanley C. Eisenstat,et al.  Downdating the Singular Value Decomposition , 1995, SIAM J. Matrix Anal. Appl..

[11]  Michael W. Berry,et al.  Downdating the Latent Semantic Indexing Model for Conceptual Information Retrieval , 1998, Comput. J..

[12]  B. S. Manjunath,et al.  An Eigenspace Update Algorithm for Image Analysis , 1997, CVGIP Graph. Model. Image Process..

[13]  Jimeng Sun,et al.  Streaming Pattern Discovery in Multiple Time-Series , 2005, VLDB.

[14]  M. Brand,et al.  Fast low-rank modifications of the thin singular value decomposition , 2006 .

[15]  G. Karypis,et al.  Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems , 2002 .

[16]  Huan Liu,et al.  Spectral Feature Selection for Data Mining , 2011 .

[17]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..