Bulk-Synchronous Parallel Gaussian Elimination

The model of bulk-synchronous parallel (BSP) computation is an emerging paradigm of general-purpose parallel computing. We study the BSP complexity of Gaussian elimination and related problems. First, we analyze the Gaussian elimination without pivoting, which can be applied to the LU decomposition of symmetric positive-definite or diagonally dominant real matrices. Then we analyze the Gaussian elimination with Schönhage's recursive local pivoting suitable for the LU decomposition of matrices over a finite field, and for the QR decomposition of real matrices by the Givens rotations. Both versions of Gaussian elimination can be performed with an optimal amount of local computation, but optimal communication and synchronization costs cannot be achieved simultaneously. The algorithms presented in the paper allow one to trade off communication and synchronization costs in a certain range, achieving optimal or near-optimal cost values at the extremes. Bibliography: 19 titles.