Split-and-Combine Singular Value Decomposition for Large-Scale Matrix

The singular value decomposition (SVD) is a fundamental matrix decomposition in linear algebra. It is widely applied in modern techniques such as high-dimensional data visualization, dimension reduction, data mining, and latent semantic analysis. Although the SVD plays an essential role in these fields, its main weakness is its cubic (order-three) computational cost. This cost makes many modern applications infeasible, especially when the data is huge and still growing. It is therefore imperative to develop a fast SVD method. When the rank of the matrix is much smaller than the matrix size, some fast SVD approaches already exist. In this paper, we focus on this low-rank case, with the additional condition that the data is too large to be stored as a single matrix. We demonstrate that the resulting fast SVD is sufficiently accurate and, most importantly, that it can be obtained very quickly. With this fast method, many modern SVD-based techniques that are currently infeasible become viable.
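The abstract does not spell out the algorithm, so the following is only a minimal sketch of one generic way a split-and-combine scheme can work for a low-rank matrix stored as row blocks; it is not necessarily the method of the paper. The function name `split_combine_svd`, its parameters, and the row-wise splitting are assumptions made for illustration. The sketch relies on the fact that stacking the truncated factors S_i V_i^T of each block yields a small matrix with the same Gram matrix as the full matrix (exactly so when `rank` is at least the true rank), so its SVD gives the singular values and right singular vectors of the full matrix.

```python
import numpy as np

def split_combine_svd(blocks, rank):
    """Thin SVD of the matrix formed by stacking `blocks` row-wise.

    Illustrative sketch only; assumes `rank` does not exceed the true
    rank of the stacked matrix, so no retained singular value is zero.
    """
    small = []
    for block in blocks:
        # Split step: thin SVD of one row block, truncated to `rank`.
        _, s, vt = np.linalg.svd(block, full_matrices=False)
        small.append(s[:rank, None] * vt[:rank])  # rows of S_i V_i^T

    # Combine step: SVD of the small stacked matrix of block factors.
    stacked = np.vstack(small)
    _, s, vt = np.linalg.svd(stacked, full_matrices=False)
    s, vt = s[:rank], vt[:rank]

    # Left singular vectors, one block at a time: U_i = A_i V S^{-1}.
    u_blocks = [block @ vt.T / s for block in blocks]
    return u_blocks, s, vt

# Usage on a synthetic rank-5 matrix split into four row blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((600, 5)) @ rng.standard_normal((5, 40))
u_blocks, s, vt = split_combine_svd(np.array_split(A, 4, axis=0), rank=5)
```

Only one block needs to be in memory at a time during the split step, and the combine step works on a matrix whose row count is the number of blocks times `rank`, which is the point of such schemes when the full matrix cannot be stored at once.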
