Parallelization of a Block Tridiagonal Solver in HPF on an IBM SP2

This paper describes how an existing Fortran code for solving block tridiagonal systems of linear equations can be parallelized using High Performance Fortran (HPF). The algorithm considered consists of a complete LU decomposition. To obtain an algorithm that parallelizes well, the rows and columns of the coefficient matrix are reordered simultaneously before the LU decomposition is constructed. Numerical results obtained on an IBM SP2 with the xlhpf compiler are compared with results obtained on a Cray T3D.