A New Method for Solving Triangular Systems on Distributed Memory Message-Passing Multiprocessors

Efficient triangular solvers for message-passing multiprocessors are required in several contexts, under the assumption that the matrix is distributed by columns (or rows) in a wrap fashion. In this paper we describe a new, efficient parallel triangular solver for this problem. The new algorithm is based on the earlier method of Li and Coleman [1986] but is considerably more efficient when $\frac{n}{p}$ is relatively modest, where $n$ is the problem dimension and $p$ is the number of processors. We provide a theoretical analysis together with extensive numerical results obtained on an Intel iPSC with $p \leq 128$.
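To make the data layout concrete, the following minimal sketch illustrates a wrap (cyclic) column mapping, assuming column $j$ is assigned to processor $j \bmod p$ with zero-based indices, together with a serial simulation of the standard column-oriented forward substitution that such a layout supports. It is neither the new algorithm nor the method of Li and Coleman; the function names (`wrap_owner`, `wrap_columns`, `column_forward_substitution`) are hypothetical and introduced only for illustration.

```python
def wrap_owner(j, p):
    """Processor that would own column j under a wrap (cyclic) mapping.

    Assumption: zero-based column and processor indices, owner = j mod p.
    """
    return j % p


def wrap_columns(n, p):
    """Column indices held by each of the p processors for an n-column matrix."""
    return [list(range(q, n, p)) for q in range(p)]


def column_forward_substitution(L, b, p):
    """Solve L x = b for a lower-triangular L using column sweeps.

    This is a serial simulation only: no messages are exchanged. The list
    `owners` records which processor would hold the column used at each step,
    which shows how the work rotates among processors under the wrap mapping.
    """
    n = len(b)
    x = list(b)  # work on a copy; b is left unchanged
    owners = []
    for j in range(n):
        owners.append(wrap_owner(j, p))
        # Component j is final once all earlier columns have been applied.
        x[j] = x[j] / L[j][j]
        # Column j updates the remaining right-hand side entries.
        for i in range(j + 1, n):
            x[i] -= L[i][j] * x[j]
    return x, owners


if __name__ == "__main__":
    L = [[2, 0, 0],
         [1, 1, 0],
         [4, 2, 2]]
    b = [2, 3, 12]
    x, owners = column_forward_substitution(L, b, p=2)
    print("x =", x)
    print("owner of each column step:", owners)
    print("columns per processor (n=7, p=3):", wrap_columns(7, 3))
```

In this sketch each step $j$ depends on the completed component $x_j$, so in a distributed setting the owner of column $j$ must receive or hold the updated right-hand side before proceeding. When $\frac{n}{p}$ is modest, each processor holds only a few columns, so there is little local work per step against which to offset this communication.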