Parallel solutions of large dense linear systems using MPI

This paper first presents two implementations of parallel Gaussian elimination using MPI: one uses cyclic data mapping and pipelined point-to-point communication; the other uses blocked data mapping and MPI collective communication. It then gives a theoretical performance analysis of the two implementations and compares the impact of the different data distribution and communication methods.
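
To make the difference between the two data mappings concrete, the following minimal sketch computes which process owns each matrix row under a cyclic and a blocked mapping, and how many still-active rows each process holds at a given elimination step. It assumes a row-wise distribution (the abstract does not say whether rows or columns are distributed), and the function names `cyclic_owner`, `block_owner`, and `active_rows_per_proc` are hypothetical, chosen only for illustration. No MPI calls are involved.

```python
def cyclic_owner(row: int, num_procs: int) -> int:
    """Cyclic mapping: row i is assigned to process i mod p."""
    return row % num_procs


def block_owner(row: int, num_rows: int, num_procs: int) -> int:
    """Blocked mapping: contiguous chunks of ceil(n / p) rows per process."""
    block_size = -(-num_rows // num_procs)  # ceiling division
    return row // block_size


def active_rows_per_proc(owners: list[int], num_procs: int, step: int) -> list[int]:
    """Count rows below pivot row `step` held by each process.

    In Gaussian elimination, only rows step+1 .. n-1 are updated
    at elimination step `step` (assumed row-wise distribution).
    """
    counts = [0] * num_procs
    for row in range(step + 1, len(owners)):
        counts[owners[row]] += 1
    return counts


n, p = 8, 4  # illustrative sizes, not taken from the paper
cyclic = [cyclic_owner(i, p) for i in range(n)]
blocked = [block_owner(i, n, p) for i in range(n)]

print(cyclic)   # [0, 1, 2, 3, 0, 1, 2, 3]
print(blocked)  # [0, 0, 1, 1, 2, 2, 3, 3]
print(active_rows_per_proc(cyclic, p, step=3))   # [1, 1, 1, 1]
print(active_rows_per_proc(blocked, p, step=3))  # [0, 0, 2, 2]
```

In this small example, the cyclic mapping keeps every process busy late in the elimination, whereas under the blocked mapping the processes that own the top rows run out of work first. This is one of the load-balance effects that a comparison of the two distributions would be expected to capture.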