Revisiting communication performance models for computational clusters

In this paper, we analyze the restrictions of traditional communication performance models that limit the accuracy of analytical prediction of the execution time of collective communication operations. In particular, we show that these models do not fully separate the constant and variable contributions of the processors and the network. Fully separating contributions that differ in nature and arise from different sources leads to more intuitive and accurate models, but the parameters of such models cannot be estimated from point-to-point experiments alone, whereas all traditional models are designed so that their parameters can be estimated from a set of point-to-point communication experiments. We demonstrate that the more intuitive models allow much more accurate analytical prediction of the execution time of collective communication operations on both homogeneous and heterogeneous clusters. We present one such point-to-point model in detail and show how it can be used to predict the execution time of scatter and gather. We describe a set of communication experiments sufficient for accurate estimation of its parameters, and we conclude with experimental results demonstrating that the model predicts the execution time of collective operations much more accurately than traditional models.
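To make the notion of a traditional model concrete, the following minimal sketch assumes a Hockney-style linear point-to-point model, T(m) = alpha + beta * m, which is one commonly used traditional model and is not the model proposed in this paper. It fits alpha and beta to point-to-point timings and then predicts a scatter under the further assumption that the root sends its messages one after another. The function names and the timings are hypothetical and purely illustrative. Note that alpha lumps together the constant delays of both the processors and the network, which is the kind of incomplete separation the abstract refers to.

```python
def fit_hockney(sizes, times):
    """Least-squares fit of T(m) = alpha + beta * m to one-way timings.

    sizes: message sizes in bytes; times: measured one-way times in seconds.
    """
    n = len(sizes)
    mean_m = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((m - mean_m) ** 2 for m in sizes)
    sxy = sum((m - mean_m) * (t - mean_t) for m, t in zip(sizes, times))
    beta = sxy / sxx                  # variable cost per byte
    alpha = mean_t - beta * mean_m    # constant cost per message
    return alpha, beta


def predict_linear_scatter(alpha, beta, msg_bytes, nprocs):
    """Predicted time of a flat-tree scatter, assuming (hypothetically)
    that the root sends nprocs - 1 messages strictly one after another."""
    return (nprocs - 1) * (alpha + beta * msg_bytes)


# Synthetic timings (not measurements): generated from alpha=1e-4 s, beta=1e-8 s/byte.
sizes = [1024, 2048, 4096, 8192]
times = [1e-4 + 1e-8 * m for m in sizes]

alpha, beta = fit_hockney(sizes, times)
scatter_time = predict_linear_scatter(alpha, beta, msg_bytes=4096, nprocs=16)
print(f"alpha={alpha:.3e} s, beta={beta:.3e} s/byte, scatter={scatter_time:.3e} s")
```

Because both parameters come from point-to-point experiments only, this sketch cannot distinguish delays on the sending processor from delays in the network or on the receiver; the prediction simply multiplies the lumped point-to-point cost by the number of messages.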
