Abstract A technique is presented for solving dense systems of linear equations by LU factorization with maximum performance on processors such as the FPS-120, FPS-5000 and X64 series, using FORTRAN with calls to elementary vector routines. The matrix elements are rearranged so that all the matrix-vector operations involved in the LU factorization are computed with only stride-1 dot-product operations, which execute at peak speed on FPS processors. Since only vector instructions are used, the algorithm is fully portable across all FPS 38/64-bit machines and, in general, to all vector computers with a similar memory structure. The performance obtained on the FPS-100 and FPS M64/60 (FPS-264) processors is reported: the asymptotic speed $r_\infty$ is always the peak speed of the machine, and the half-performance length is $N_{1/2} = 238$ for the FPS-100 and $N_{1/2} = 200$ for the FPS M64/60. The $N_{1/2}$ values could be lowered by coding some critical parts in the APAL assembly language, at the cost of code portability.
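The idea of a dot-product-only LU factorization can be illustrated with a minimal sketch. The abstract does not spell out the actual rearrangement, so the layout below is an assumption: L is kept by rows and U by columns, which makes every inner product run over contiguous (stride-1) storage. The names `dot`, `lu_dot_form` and `solve` are hypothetical, Python stands in for the FORTRAN of the paper, and pivoting is omitted for brevity.

```python
def dot(x, y, n):
    """Dot product of the first n elements of x and y (contiguous access)."""
    s = 0.0
    for t in range(n):
        s += x[t] * y[t]
    return s


def lu_dot_form(a):
    """Doolittle LU without pivoting, written with dot products only.

    Assumed layout (not taken from the paper): low[i] is row i of L and
    upt[j] is column j of U, so both operands of every dot product are
    contiguous lists.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    upt = [[0.0] * n for _ in range(n)]
    for k in range(n):
        low[k][k] = 1.0
        # Row k of U: one dot product per entry.
        for j in range(k, n):
            upt[j][k] = a[k][j] - dot(low[k], upt[j], k)
        if upt[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        # Column k of L: one dot product per entry.
        for i in range(k + 1, n):
            low[i][k] = (a[i][k] - dot(low[i], upt[k], k)) / upt[k][k]
    return low, upt


def solve(low, upt, b):
    """Solve A x = b from the factors returned by lu_dot_form."""
    n = len(b)
    y = [0.0] * n
    # Forward substitution L y = b: row i of L is contiguous.
    for i in range(n):
        y[i] = b[i] - dot(low[i], y, i)
    # Back substitution U x = y, column-oriented because U is stored
    # by columns: each step updates x with a multiple of one column.
    x = y[:]
    for j in range(n - 1, -1, -1):
        x[j] /= upt[j][j]
        for i in range(j):
            x[i] -= upt[j][i] * x[j]
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
    low, upt = lu_dot_form(a)
    print(solve(low, upt, [12.0, 14.0, 22.0]))
```

Storing U transposed is what removes the strided column access that a row-major U would require; a production code would add partial pivoting and replace `dot` with the machine's vector dot-product routine.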