Use of Level 3 BLAS in LU Factorization in a Multiprocessing Environment on Three Vector Multiprocessors: the Alliant FX/80, the CRAY-2, and the IBM 3090 VF

We study various implementations of block Gaussian elimination on full matrices and examine their performance on three parallel computers: the Alliant FX/80, the CRAY-2, and the IBM 3090-400/VF. These implementations are expressed in terms of Level 3 BLAS matrix-matrix kernels. We consider the use of parallel Level 3 BLAS kernels and compare the parallelism obtained within the computational kernels with that obtained when parallelizing over the kernels. We show that the use of parallel Level 3 BLAS allows portability without sacrifice of efficiency, even in a parallel environment, and that high speeds can be obtained if tuned versions of the kernels are available.
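To make concrete what "expressed in terms of Level 3 BLAS matrix-matrix kernels" can mean, here is a minimal sketch of one right-looking block LU variant in plain Python. It is an illustration under stated assumptions, not the implementation studied in the paper: pivoting is omitted for brevity, and the names `factor_panel`, `trsm_lower_unit`, `gemm_sub`, and `block_lu` are hypothetical stand-ins for a panel factorization and for TRSM-like and GEMM-like kernels.

```python
def factor_panel(A, k0, k1, n):
    """Unblocked elimination on the column panel A[k0:n, k0:k1] (no pivoting)."""
    for k in range(k0, k1):
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]                    # multiplier, stored in place of L
            for j in range(k + 1, k1):
                A[i][j] -= A[i][k] * A[k][j]


def trsm_lower_unit(A, k0, k1, c0, c1):
    """TRSM-like step: overwrite A[k0:k1, c0:c1] with inv(L11) times itself,
    where L11 is the unit lower triangle stored in A[k0:k1, k0:k1]."""
    for i in range(k0, k1):
        for p in range(k0, i):
            for j in range(c0, c1):
                A[i][j] -= A[i][p] * A[p][j]


def gemm_sub(A, r0, r1, c0, c1, k0, k1):
    """GEMM-like step: A[r0:r1, c0:c1] -= A[r0:r1, k0:k1] * A[k0:k1, c0:c1]."""
    for i in range(r0, r1):
        for j in range(c0, c1):
            s = 0.0
            for k in range(k0, k1):
                s += A[i][k] * A[k][j]
            A[i][j] -= s


def block_lu(A, nb):
    """Factor the square matrix A in place as L*U using blocks of width nb.
    L (unit diagonal, not stored) and U share the storage of A.
    Assumes no pivoting is needed, e.g. A is diagonally dominant."""
    n = len(A)
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        factor_panel(A, k0, k1, n)                # L11, L21 and U11
        if k1 < n:
            trsm_lower_unit(A, k0, k1, k1, n)     # U12 = inv(L11) * A12
            gemm_sub(A, k1, n, k1, n, k0, k1)     # A22 = A22 - L21 * U12
    return A
```

In this sketch most of the arithmetic falls in `gemm_sub`, so parallelism could be sought either inside that kernel or across independent calls on different blocks of the trailing matrix, which is the kind of choice the abstract contrasts. A tuned library would replace the triple loops with calls to its own kernels.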
