We consider the solution of triangular systems of linear equations on a distributed-memory multiprocessor that admits a ring embedding. Specifically, we propose a parallel algorithm that applies when the triangular matrix is distributed by column in a wrap fashion. Numerical experiments indicate that the new algorithm is very efficient in some circumstances, in particular when the problem size is sufficiently large relative to the number of processors. A theoretical analysis confirms that the total running time varies linearly with the matrix order up to a threshold value of the matrix order, after which the dependence is quadratic. Moreover, we show that total message traffic is essentially the minimum possible. Finally, we describe an analogous row-oriented algorithm.
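To make the data layout concrete, the following is a minimal serial sketch of column-oriented forward substitution for a lower triangular system, with columns assigned to processors in a wrap fashion (assumed here to mean that column j belongs to processor j mod p). It is not the algorithm proposed in the paper: it contains no ring communication and no parallelism, and the names `wrap_owner` and `column_forward_solve` are hypothetical, introduced only for illustration.

```python
def wrap_owner(j, p):
    """Return the processor that owns column j under wrap mapping (assumed j mod p)."""
    return j % p


def column_forward_solve(L, b, p):
    """Solve L x = b for lower triangular L, one column at a time.

    Serial simulation only: 'owned' records which simulated processor
    holds each column; no messages are actually exchanged.
    """
    n = len(b)
    # Columns held by each simulated processor under wrap mapping.
    owned = {q: [j for j in range(n) if wrap_owner(j, p) == q] for q in range(p)}
    x = [0.0] * n
    rhs = list(b)  # working copy of the right-hand side
    for j in range(n):
        # The owner of column j would compute x[j] ...
        x[j] = rhs[j] / L[j][j]
        # ... and use its column to update the remaining right-hand side entries.
        for i in range(j + 1, n):
            rhs[i] -= L[i][j] * x[j]
    return x, owned


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    b = [2.0, 7.0, 32.0]
    x, owned = column_forward_solve(L, b, p=2)
    print(x, owned)
```

In a genuinely distributed setting, the partially updated right-hand side would have to travel between the owners of successive columns; how that traffic is organized on the ring is the subject of the paper and is not modeled in this sketch.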