Basic Matrix Subprograms for Distributed Memory Systems

Parallel systems are, in general, complicated to use efficiently. As they grow in complexity, it becomes increasingly important to provide libraries and language features that spare users from low-level system details. Our effort in this direction is to develop a set of basic matrix algorithms for distributed memory systems such as the hypercube. The goal is to provide for distributed memory systems an environment similar to the one that the Level-3 Basic Linear Algebra Subprograms (BLAS3) provide for sequential and shared memory environments. These subprograms facilitate the development of efficient and portable algorithms that are rich in matrix-matrix multiplication, and major software efforts such as LAPACK have been built on them. To demonstrate the concept, some of these Level-3 algorithms are being developed on the Intel iPSC/2 hypercube. Central to this effort is the general matrix-matrix multiplication routine PGEMM. Symmetric and triangular multiplications, rank-2k updates (symmetric case), and the solution of triangular systems with multiple right-hand sides are also discussed.