Automatic Parallelization of the Conjugate Gradient Algorithm

The conjugate gradient (CG) method is a popular Krylov space method for solving systems of linear equations of the form Ax = b, where A is a symmetric positive-definite matrix. This method can be applied regardless of whether A is dense or sparse. In this paper, we show how restructuring compiler technology can be applied to transform a sequential, dense matrix CG program into a parallel, sparse matrix CG program. On the IBM SP-2, the performance of our compiled code is comparable to that of handwritten code from the PETSc library at Argonne.
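To make the dense-versus-sparse distinction concrete, the following is a minimal sketch of CG in plain Python; it is not the code the compiler described here produces. The function names (`dense_matvec`, `csr_matvec`, `conjugate_gradient`) and the compressed sparse row (CSR) layout are assumptions chosen for illustration. The solver touches A only through a matrix-vector product, so swapping the dense product for a sparse one leaves the iteration itself unchanged.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dense_matvec(A, x):
    """y = A x for A stored as a full list of rows."""
    return [dot(row, x) for row in A]


def csr_matvec(values, col_idx, row_ptr, x):
    """y = A x for A stored in CSR form (nonzeros only)."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    `matvec` is any callable returning A @ p; the storage format of A
    is hidden behind it (an illustrative design choice).
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x, with x = 0 initially
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)                      # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # direction update
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# A small symmetric positive-definite tridiagonal matrix in both formats.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [1.0, 2.0, 3.0]

x_dense = conjugate_gradient(lambda p: dense_matvec(A, p), b)
x_sparse = conjugate_gradient(
    lambda p: csr_matvec(values, col_idx, row_ptr, p), b)
```

Both calls run the same iteration; only the matrix-vector product differs. The sparse version skips the zero entries of A, which is where most of the savings come from when A has few nonzeros per row.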
