Modelling the runtime of scientific programs on parallel computers

Message-passing programs for parallel machines with a distributed address space that use the communication operations of portable libraries such as PVM and MPI are portable to most available parallel machines. However, because the runtime libraries are implemented differently on each platform, the same program may have different runtimes on different machines; that is, efficiency is often not portable. Quantitative results about the runtimes of collective communication operations on specific parallel machines can therefore be helpful. We show that the execution time of collective communication operations can be modelled by runtime functions in closed form. As examples, we consider the Cray T3D and T3E machines. We demonstrate that the runtime functions can be used to model the computation and communication behavior of complete programs by investigating the parallel implementation of a solution method for ordinary differential equations. As an application, we consider the use of this method for solving a time-dependent reaction-diffusion equation.
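
To illustrate what a closed-form runtime function might look like, the following Python sketch fits a simple model to timing data. The functional form `(tau + tc * b) * log2(p)` for a broadcast, the parameter names `tau` (startup time) and `tc` (per-byte transfer time), and the synthetic measurements are all assumptions made for illustration; they are not the functions or values obtained for the T3D or T3E.

```python
import math

def broadcast_time(p, b, tau, tc):
    """Hypothetical closed-form runtime (seconds) of a broadcast of b bytes
    to p processors: (tau + tc * b) * log2(p). Assumed form, not the paper's."""
    return (tau + tc * b) * math.log2(p)

def fit_runtime_function(samples):
    """Estimate tau and tc from (p, b, measured_time) samples with p > 1.

    Dividing each time by log2(p) leaves a line in b, which is fitted
    by ordinary least squares on the normalized times."""
    xs = [b for _, b, _ in samples]
    ys = [t / math.log2(p) for p, _, t in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    tc = sxy / sxx
    tau = mean_y - tc * mean_x
    return tau, tc

# Synthetic "measurements" generated from assumed parameters (illustrative only).
TRUE_TAU, TRUE_TC = 50e-6, 0.01e-6
samples = [
    (p, b, broadcast_time(p, b, TRUE_TAU, TRUE_TC))
    for p in (2, 4, 8, 16)
    for b in (1024, 8192, 65536)
]

tau, tc = fit_runtime_function(samples)
# Use the fitted function to predict a configuration that was not measured.
predicted = broadcast_time(32, 32768, tau, tc)
print(f"tau = {tau:.3e} s, tc = {tc:.3e} s/byte, predicted = {predicted:.3e} s")
```

With real measurements in place of the synthetic samples, the fitted function could be evaluated for other processor counts and message sizes, which is the kind of use a runtime function is meant to support when modelling a complete program.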