Breaking the barriers: two models for MPI programming

The asynchronous nature of many MPI/PVM programs does not fit the BSP model. The barrier synchronization imposed by the model restricts both the range of available algorithms and their performance. By suppressing barriers and generalizing the concept of superstep, we propose two new models: the BSP-like model and the BSP Without Barriers (BSPWB) model. While the BSP-like model extends the BSP* model to programs written using collective operations, the more general BSPWB model admits the asynchronous parallel programming style of MPI/PVM. Like LogP, the model encourages locality, but it is simpler to use. The parameters of the models and their quality are evaluated on a distributed shared memory machine, the Origin 2000, and on a distributed memory machine, the CRAY T3E. The time spent in an h-relation depends more strongly on the communication pattern than on the number of processors. Across both patterns and processor counts, the total variation of the h-relation time is smaller than sixty nanoseconds. To illustrate the proposed models, two different applications are considered: a Parallel Sort using Regular Sampling (PSRS) and a Parallel Dynamic Programming Algorithm solving the Single Resource Allocation Problem (SRAP). The PSRS is a synchronous algorithm with a rich set of collective communication patterns and coarse-grain communications. At the opposite extreme, the SRAP is a fine-grain communication algorithm using permutation patterns. The computational results confirm the accuracy of the models. The prediction of the communication times is robust even for the SRAP, where communication is dominated by small messages.
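
As background for the barrier cost discussed above, the classical BSP model is usually stated as charging each superstep the maximum local work, plus g times the size h of the h-relation, plus the barrier latency L. The sketch below implements only that standard baseline formula, not the BSP-like or BSPWB costs proposed here. The function names and the sample values of g and L are illustrative assumptions, not measurements from the Origin 2000 or the CRAY T3E.

```python
def h_relation(sent, received):
    """Size of the h-relation: the largest number of words that any
    single processor sends or receives during the superstep."""
    return max(max(s, r) for s, r in zip(sent, received))


def bsp_superstep_cost(work, sent, received, g, L):
    """Classical BSP cost of one superstep: w + g*h + L, where
    w is the maximum local computation time over all processors,
    g is the per-word communication gap (assumed value), and
    L is the barrier synchronization latency (assumed value)."""
    w = max(work)
    h = h_relation(sent, received)
    return w + g * h + L


# Hypothetical superstep on four processors.
work = [100, 120, 90, 110]   # local computation per processor
sent = [10, 4, 6, 8]         # words sent per processor
received = [5, 12, 3, 8]     # words received per processor

cost = bsp_superstep_cost(work, sent, received, g=2.0, L=50.0)
```

In this baseline every superstep pays the L term regardless of how much data is exchanged, which is the barrier overhead that the two proposed models aim to remove.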