Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication

Motivated by applications in which the data may be formulated as a matrix, we consider algorithms for several common linear algebra problems. These algorithms make more efficient use of computational resources, such as the computation time, random access memory (RAM), and the number of passes over the data, than do previously known algorithms for these problems. In this paper, we devise two algorithms for the matrix multiplication problem. Suppose $A$ and $B$ (which are $m\times n$ and $n\times p$, respectively) are the two input matrices. In our main algorithm, we perform $c$ independent trials, where in each trial we randomly sample an element of $\{ 1,2,\ldots, n\}$ with an appropriate probability distribution ${\cal P}$ on $\{ 1,2,\ldots, n\}$. We form an $m\times c$ matrix $C$ consisting of the sampled columns of $A$, each scaled appropriately, and we form a $c\times n$ matrix $R$ using the corresponding rows of $B$, again scaled appropriately. The choice of ${\cal P}$ and the column and row scaling are crucial features of the algorithm. When these are chosen judiciously, we show that $CR$ is a good approximation to $AB$. More precisely, we show that $$ \left\|AB-CR\right\|_F = O(\left\|A\right\|_F \left\|B\right\|_F /\sqrt c) , $$ where $\|\cdot\|_F$ denotes the Frobenius norm, i.e., $\|A\|^2_F=\sum_{i,j}A_{ij}^2$. This algorithm can be implemented without storing the matrices $A$ and $B$ in RAM, provided it can make two passes over the matrices stored in external memory and use $O(c(m+n+p))$ additional RAM to construct $C$ and $R$. We then present a second matrix multiplication algorithm which is similar in spirit to our main algorithm. In addition, we present a model (the pass-efficient model) in which the efficiency of these and other approximate matrix algorithms may be studied and which we argue is well suited to many applications involving massive data sets. In this model, the scarce computational resources are the number of passes over the data and the additional space and time required by the algorithm. The input matrices may be presented in any order of the entries (and not just row or column order), as is the case in many applications where, e.g., the data has been written in by multiple agents. In addition, the input matrices may be presented in a sparse representation, where only the nonzero entries are written.

[1]  V. Strassen Gaussian elimination is not optimal , 1969 .

[2]  J. Ian Munro,et al.  Selection and sorting with limited storage , 1978, 19th Annual Symposium on Foundations of Computer Science (sfcs 1978).

[3]  János Komlós,et al.  The eigenvalues of random symmetric matrices , 1981, Comb..

[4]  Charles R. Johnson,et al.  Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.

[5]  Jeffrey Scott Vitter,et al.  Random sampling with a reservoir , 1985, TOMS.

[6]  Don Coppersmith,et al.  Matrix multiplication via arithmetic progressions , 1987, STOC.

[7]  Colin McDiarmid,et al.  Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .

[8]  V. N. Bogaevski,et al.  Matrix Perturbation Theory , 1991 .

[9]  Noga Alon,et al.  The space complexity of approximating the frequency moments , 1996, STOC '96.

[10]  Edith Cohen,et al.  Approximating matrix multiplication for pattern recognition tasks , 1997, SODA '97.

[11]  Peter J. Haas,et al.  The New Jersey Data Reduction Report , 1997 .

[12]  Prabhakar Raghavan,et al.  Computing on data streams , 1999, External Memory Algorithms.

[13]  Coordinate Restrictions of Linear Operators in L , 2000 .

[14]  N. Alon,et al.  On the concentration of eigenvalues of random symmetric matrices , 2000, math-ph/0009032.

[15]  Petros Drineas,et al.  Fast Monte-Carlo algorithms for approximate matrix multiplication , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[16]  Dimitris Achlioptas,et al.  Fast computation of low rank matrix approximations , 2001, STOC '01.

[17]  Jessica H. Fong,et al.  An Approximate Lp Difference Algorithm for Massive Data Streams , 1999, Discret. Math. Theor. Comput. Sci..

[18]  Sudipto Guha,et al.  Fast, small-space algorithms for approximate histogram maintenance , 2002, STOC '02.

[19]  Ziv Bar-Yossef,et al.  Sampling lower bounds via information theory , 2003, STOC '03.

[20]  Petros Drineas,et al.  Pass efficient algorithms for approximating large matrices , 2003, SODA '03.

[21]  Alan M. Frieze,et al.  Fast monte-carlo algorithms for finding low-rank approximations , 2004, JACM.

[22]  Petros Drineas,et al.  FAST MONTE CARLO ALGORITHMS FOR MATRICES II: COMPUTING A LOW-RANK APPROXIMATION TO A MATRIX∗ , 2004 .

[23]  Petros Drineas,et al.  FAST MONTE CARLO ALGORITHMS FOR MATRICES III: COMPUTING A COMPRESSED APPROXIMATE MATRIX DECOMPOSITION∗ , 2004 .