Benchmarking message passing performance using MPI

This paper describes a set of microbenchmarks for measuring the performance characteristics of MPI implementations. We explain the rationale behind the benchmarks and present results for the Intel Paragon, IBM SP2, and SGI Power Challenge. Our measurements reveal how the hardware architecture and the underlying MPI implementation affect message-passing performance on these three platforms.