An Analysis of Popular MPI Implementations

The MPI 1.1 definition includes routines for nonblocking point-to-point communication that are intended to support the overlap of communication with computation. We describe two experiments that test the ability of MPI implementations to actually perform this overlap. One experiment tests synchronization overlap, and the other tests data-transfer overlap. We give results for vendor-supplied MPI implementations on the CRAY T3E, IBM SP, and SGI Origin2000 at the CEWES MSRC, along with results for MPICH on the T3E. All the implementations show full support for synchronization overlap. Conversely, none of them supports data-transfer overlap at the level needed for significant performance improvement in our experiment. We suggest that programming for overlap may not be worthwhile for a broad class of parallel applications using many current MPI implementations.
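To make the notion of overlap concrete, the following is a minimal sketch of an idealized timing model, not the measurement procedure used in the experiments. It assumes that a computation phase of duration `t_comp` and a communication phase of duration `t_comm` either run strictly one after the other (no overlap) or fully concurrently (full overlap). The function names `expected_times` and `overlap_fraction` are hypothetical and introduced here only for illustration.

```python
def expected_times(t_comp, t_comm):
    """Return (no_overlap, full_overlap) elapsed-time estimates.

    Assumption (idealized model): without overlap the two phases
    serialize; with full overlap the shorter phase is hidden
    entirely behind the longer one.
    """
    no_overlap = t_comp + t_comm
    full_overlap = max(t_comp, t_comm)
    return no_overlap, full_overlap


def overlap_fraction(t_measured, t_comp, t_comm):
    """Estimate how much of the hideable time was actually hidden.

    Returns a value clamped to [0.0, 1.0]:
    0.0 means the measured time matches the serialized estimate,
    1.0 means it matches the fully overlapped estimate.
    """
    no_overlap, full_overlap = expected_times(t_comp, t_comm)
    hideable = no_overlap - full_overlap  # equals min(t_comp, t_comm)
    if hideable <= 0:
        return 0.0  # nothing could have been hidden
    fraction = (no_overlap - t_measured) / hideable
    return max(0.0, min(1.0, fraction))
```

Under this model, a measured time close to `t_comp + t_comm` suggests that the nonblocking calls gave little or no overlap, while a time close to `max(t_comp, t_comm)` suggests that the shorter phase was almost entirely hidden.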