Parallel Implementation of Fast Randomized Algorithms for Low Rank Matrix Decomposition

We analyze the parallel performance of randomized interpolative decomposition by decomposing low-rank complex-valued Gaussian random matrices of about 100 GB. We chose a Cray XMT supercomputer because it provides an almost ideal PRAM model, permitting quick investigation of parallel algorithms without obfuscation from hardware idiosyncrasies. On non-square matrices, performance scales almost linearly, with runtime about 100 times faster on 128 processors. We also verify that numerically discovered error bounds still hold on matrices two orders of magnitude larger than those previously tested.
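As a rough, serial illustration of the kind of computation being benchmarked, the following is a minimal sketch of a randomized interpolative decomposition on a small, exactly low-rank complex Gaussian matrix. It is not the paper's Cray XMT implementation. The function name `randomized_id`, the oversampling parameter, and the matrix sizes are assumptions chosen for illustration, and the column selection uses SciPy's column-pivoted QR rather than the Gram-Schmidt variants the parallel code may rely on.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular


def randomized_id(A, k, oversample=8, seed=0):
    """Sketch of a rank-k randomized interpolative decomposition.

    Returns (cols, P) such that A is approximately A[:, cols] @ P.
    The name and parameters are hypothetical, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m)  # number of random projections

    # Compress the rows of A with a complex Gaussian test matrix.
    G = rng.standard_normal((l, m)) + 1j * rng.standard_normal((l, m))
    Y = G @ A  # l x n sketch of A

    # Column-pivoted QR of the sketch: Y[:, perm] = Q @ R.
    _, R, perm = qr(Y, mode="economic", pivoting=True)

    # Express the remaining columns in terms of the first k pivot columns.
    T = solve_triangular(R[:k, :k], R[:k, k:])

    # Assemble the k x n interpolation matrix in the original column order.
    P = np.zeros((k, n), dtype=complex)
    P[:, perm[:k]] = np.eye(k)
    P[:, perm[k:]] = T
    return perm[:k], P


# Small test matrix of exact rank k, built as a product of Gaussian factors.
rng = np.random.default_rng(1)
m, n, k = 200, 300, 10
B = rng.standard_normal((m, k)) + 1j * rng.standard_normal((m, k))
C = rng.standard_normal((k, n)) + 1j * rng.standard_normal((k, n))
A = B @ C

cols, P = randomized_id(A, k)
rel_err = np.linalg.norm(A - A[:, cols] @ P) / np.linalg.norm(A)
print("relative Frobenius error:", rel_err)
```

In this sketch the dominant costs are the product `G @ A` and the factorization of the much smaller sketch `Y`. The matrix product is the step that parallelizes most naturally across processors.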
