Accurate Heterogeneous Communication Models and a Software Tool for Their Efficient Estimation

In this paper, we analyze the restrictions of traditional communication performance models that limit the accuracy with which the execution time of collective communication operations on homogeneous and heterogeneous clusters can be predicted analytically. In particular, we show that these models do not fully separate the constant and variable contributions of the processors and the network. Fully separating contributions that differ in nature and origin would yield more intuitive and accurate models, but the parameters of such models cannot be estimated from point-to-point experiments alone, which are what traditional models usually rely on. The paper presents such an intuitive and accurate point-to-point model and describes a set of communication experiments sufficient to estimate its parameters. It also presents a software tool that implements the new model and automates the estimation of both this model and heterogeneous extensions of traditional communication performance models. We conclude with experimental results demonstrating that the proposed model predicts the execution time of different algorithms for collective operations much more accurately than traditional models.
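
The claim that fully separated parameters cannot be recovered from point-to-point experiments alone can be illustrated with a toy sketch. It assumes, purely for illustration, that the constant part of a point-to-point time between processors i and j has the form C_i + L_ij + C_j (a fixed delay per processor plus a link latency). This form, the helper name `p2p_times`, and all numeric values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def p2p_times(C, L):
    """Constant part of every pairwise time under the ASSUMED form
    T_ij = C_i + L_ij + C_j (processor delays plus link latency)."""
    n = len(C)
    return {(i, j): C[i] + L[(i, j)] + C[j] for i, j in combinations(range(n), 2)}


# Hypothetical parameter set A: three processors, three links.
C_a = [1.0, 2.0, 3.0]
L_a = {(0, 1): 10.0, (0, 2): 20.0, (1, 2): 30.0}

# Parameter set B: move d units of delay from every link into both endpoints.
d = 0.5
C_b = [c + d for c in C_a]
L_b = {pair: latency - 2 * d for pair, latency in L_a.items()}

times_a = p2p_times(C_a, L_a)
times_b = p2p_times(C_b, L_b)

# Both parameter sets predict the same point-to-point times, so these
# measurements alone cannot tell them apart.
print(times_a == times_b)
```

Under this assumed form, three pairwise measurements constrain six unknowns, so some additional kind of experiment is needed to pin down the processor and network terms separately.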
