ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers

The authors describe ScaLAPACK, a distributed memory version of the LAPACK software package for dense and banded matrix computations. Key design features are the use of distributed versions of the Level 3 BLAS as building blocks, and an object-oriented interface to the library routines. The square block scattered decomposition is described. The implementation of a distributed memory version of the right-looking LU factorization algorithm on the Intel Delta multicomputer is discussed, and performance results are presented that demonstrate the scalability of the algorithm.
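
As an illustration of the square block scattered decomposition mentioned above, the following minimal sketch maps a global matrix entry to its owning process and local index. It assumes square r x r blocks dealt out cyclically over a P x Q process grid with zero-based indices; the function name `block_scattered_owner` and its parameters are hypothetical and are not part of the ScaLAPACK interface.

```python
def block_scattered_owner(i, j, r, P, Q):
    """Map global entry (i, j) to ((p, q), (li, lj)).

    Assumptions (illustrative only): zero-based indices, square r x r
    blocks, and blocks assigned cyclically to a P x Q process grid.
    """
    bi, bj = i // r, j // r        # global block row and block column
    p, q = bi % P, bj % Q          # grid coordinates of the owning process
    li = (bi // P) * r + i % r     # row index within the owner's local array
    lj = (bj // Q) * r + j % r     # column index within the owner's local array
    return (p, q), (li, lj)


if __name__ == "__main__":
    # Entry (5, 2) with 2 x 2 blocks on a 2 x 3 process grid.
    print(block_scattered_owner(5, 2, r=2, P=2, Q=3))
```

Because consecutive blocks go to different processes, each process keeps a share of the trailing submatrix as a right-looking factorization proceeds, which is the usual motivation for a scattered rather than contiguous layout.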
