Level 3 BLAS in LU Factorization on the CRAY-2, ETA-10P, and IBM 3090-200/VF

We study various implementations of block Gaussian elimination on full matrices and examine their performance on three vector supercomputers, the CRAY-2, the ETA-10P, and the IBM 3090-200/VF. We show that the use of Level 3 BLAS kernels allows portability without sacrifice of efficiency and that good speeds can be obtained if tuned versions of the kernels are available. Indeed, our results show that, without using any assembler language outside the kernels, we can approach the performance of assembler-coded routines on all machines.
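To make the idea of block Gaussian elimination built on Level 3 BLAS kernels concrete, the following is a minimal sketch of one common variant, a right-looking block LU factorization. It is not the implementation studied in the paper: the function names (`factor_panel`, `trsm_lower_unit`, `gemm_update`, `block_lu`, `reconstruct`) and the block size `nb` are illustrative assumptions, the kernels are plain Python loops standing in for tuned TRSM-like and GEMM-like routines, and pivoting is omitted, so the sketch assumes a matrix (for example, a diagonally dominant one) that can be factored safely without row interchanges.

```python
def factor_panel(A, k0, k1, n):
    """Unblocked LU (no pivoting) of the panel A[k0:n, k0:k1], in place."""
    for k in range(k0, k1):
        piv = A[k][k]
        for i in range(k + 1, n):
            A[i][k] /= piv                      # multiplier, stored as L
            for j in range(k + 1, k1):
                A[i][j] -= A[i][k] * A[k][j]    # update inside the panel only


def trsm_lower_unit(A, k0, k1, c0, c1):
    """TRSM-like kernel: solve L11 * X = A12 in place.

    L11 is the unit lower triangle of A[k0:k1, k0:k1];
    A12 is A[k0:k1, c0:c1] and is overwritten by X (= U12).
    """
    for j in range(c0, c1):
        for i in range(k0, k1):
            s = A[i][j]
            for k in range(k0, i):
                s -= A[i][k] * A[k][j]
            A[i][j] = s


def gemm_update(A, r0, r1, c0, c1, k0, k1):
    """GEMM-like kernel: A[r0:r1, c0:c1] -= A[r0:r1, k0:k1] * A[k0:k1, c0:c1]."""
    for i in range(r0, r1):
        for j in range(c0, c1):
            s = 0.0
            for k in range(k0, k1):
                s += A[i][k] * A[k][j]
            A[i][j] -= s


def block_lu(A, nb):
    """Right-looking block LU without pivoting; overwrites A with L and U."""
    n = len(A)
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        factor_panel(A, k0, k1, n)                 # L11, U11, L21
        if k1 < n:
            trsm_lower_unit(A, k0, k1, k1, n)      # U12 = L11^{-1} A12
            gemm_update(A, k1, n, k1, n, k0, k1)   # A22 -= L21 * U12
    return A


def reconstruct(LU):
    """Multiply the packed factors back together (L has a unit diagonal)."""
    n = len(LU)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else LU[i][k]
                s += l_ik * LU[k][j]
            R[i][j] = s
    return R
```

In this arrangement only the two kernel routines would need machine-specific tuning; the driver `block_lu` stays unchanged across machines, which is the portability argument in miniature. Calling `reconstruct(block_lu(copy_of_A, nb))` and comparing against the original matrix is a simple way to check the factorization.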
