Gaussian elimination with partial pivoting and load balancing on a multiprocessor

Abstract: A row-oriented implementation of Gaussian elimination with partial pivoting on a local-memory multiprocessor is described. In the absence of pivoting, the initial distribution of the data among the node processors leads to a balanced computation. If row interchanges occur, however, the computational loads on the processors may become unbalanced, leading to inefficiency. A simple, inexpensive load-balancing scheme is described that maintains computational balance in the presence of pivoting. Under some reasonable assumptions about the probability that pivoting occurs, an analysis of the communication costs of the algorithm is developed, along with an analysis of the computation performed in each node processor. This model is then used to derive the expected speedup of the algorithm. Finally, experiments on an Intel hypercube are presented to show how well the analytical model predicts the observed performance.
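
The imbalance the abstract refers to can be illustrated with a small serial simulation. The sketch below is not the paper's algorithm or its load-balancing scheme. It assumes a wrap mapping (row i is stored on processor i mod p), charges a hypothetical n - k work units for each row update at step k, and ignores communication entirely. The function names `permutation_matrix` and `simulate_loads` are made up for this illustration. Two variants are compared: leaving the pivot row where it is (implicit pivoting), and physically exchanging it with row k so that the wrap mapping is preserved.

```python
def permutation_matrix(order):
    """Return an n x n matrix whose column k has its single 1.0 in row order[k]."""
    n = len(order)
    a = [[0.0] * n for _ in range(n)]
    for k, r in enumerate(order):
        a[r][k] = 1.0
    return a


def simulate_loads(a, p, exchange_rows):
    """Run row-oriented elimination with partial pivoting on a copy of `a`
    and return the work units charged to each of p simulated processors.

    Assumptions (not taken from the paper): storage position i belongs to
    processor i % p, and one row update at step k costs n - k units.
    """
    n = len(a)
    a = [row[:] for row in a]      # do not modify the caller's matrix
    active = list(range(n))        # storage positions not yet used as a pivot
    work = [0] * p
    for k in range(n):
        # Partial pivoting: largest magnitude in column k among active rows.
        r = max(active, key=lambda i: abs(a[i][k]))
        if a[r][k] == 0.0:
            raise ValueError("matrix is singular")
        if exchange_rows and r != k:
            # Explicit interchange: the pivot row moves to position k, so the
            # remaining active rows stay evenly wrapped over the processors.
            a[k], a[r] = a[r], a[k]
            r = k
        active.remove(r)
        for i in active:
            m = a[i][k] / a[r][k]
            for j in range(k, n):
                a[i][j] -= m * a[r][j]
            work[i % p] += n - k   # charge the owner of storage position i
    return work


# Worst case for two processors: every early pivot comes from processor 0.
a = permutation_matrix([0, 2, 4, 6, 1, 3, 5, 7])
implicit = simulate_loads(a, 2, exchange_rows=False)
exchanged = simulate_loads(a, 2, exchange_rows=True)
print(implicit)   # [44, 124]
print(exchanged)  # [74, 94]
```

Both variants perform the same total work (168 units under this cost model), but without interchanges processor 0 runs out of active rows after four steps and processor 1 carries the rest. The explicit exchange restores the balance of the no-pivoting case at the price of moving rows between processors, which is the communication cost this sketch leaves out.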
