Efficient Matrix Multiplication on SIMD Computers

Efficient algorithms are described for matrix multiplication on SIMD computers. SIMD implementations of Winograd’s algorithm are considered in the case where additions are faster than multiplications. Classical kernels and the use of Strassen’s algorithm are also considered. Actual performance figures using the MasPar family of SIMD computers are presented and discussed.
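
To illustrate why Winograd's algorithm pays off when additions are faster than multiplications, the following is a minimal serial sketch in Python. It assumes the standard form of Winograd's inner-product identity, which pairs up adjacent terms so that products depending only on a row of A or only on a column of B can be computed once and reused. The function names `classical_matmul` and `winograd_matmul` are hypothetical, and the code is not the SIMD implementation or the MasPar kernels the paper describes.

```python
def classical_matmul(A, B):
    """Reference product: n multiplications per inner product."""
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]


def winograd_matmul(A, B):
    """Product of A (m x n) and B (n x p) via Winograd's identity.

    For each pair of adjacent indices (2j, 2j+1):
      (a0 + b1) * (a1 + b0) = a0*a1 + a0*b0 + a1*b1 + b0*b1
    so subtracting the precomputed a0*a1 and b0*b1 terms leaves a0*b0 + a1*b1.
    Each inner product then costs about n/2 multiplications instead of n,
    at the price of extra additions.
    """
    m, n, p = len(A), len(B), len(B[0])
    half = n // 2

    # Row factors depend only on A; computed once per row.
    row = [sum(A[i][2 * j] * A[i][2 * j + 1] for j in range(half))
           for i in range(m)]
    # Column factors depend only on B; computed once per column.
    col = [sum(B[2 * j][k] * B[2 * j + 1][k] for j in range(half))
           for k in range(p)]

    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for k in range(p):
            s = -row[i] - col[k]
            for j in range(half):
                s += (A[i][2 * j] + B[2 * j + 1][k]) * (A[i][2 * j + 1] + B[2 * j][k])
            if n % 2:
                # Odd inner dimension: the last term has no partner.
                s += A[i][n - 1] * B[n - 1][k]
            C[i][k] = s
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(winograd_matmul(A, B) == classical_matmul(A, B))
```

For square matrices of order n, this arrangement uses roughly n^3/2 + n^2 multiplications rather than n^3, while the number of additions grows to about 3n^3/2. In floating-point arithmetic the two routines can differ in rounding behaviour, so the exact equality check in the usage example is only meaningful for integer inputs.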
