Modeling the effects of contention on the performance of heterogeneous applications

Fast networks have made it possible to coordinate distributed heterogeneous CPU, memory, and storage resources into a powerful platform for executing high-performance applications. However, the performance of applications on such systems depends heavily on the allocation and efficient coordination of application tasks. A key component of a performance-efficient allocation strategy is a predictive model that provides a realistic estimate of application performance under varying resource loads. In this paper, we present a model for predicting the effects of contention on application behavior in heterogeneous systems. In particular, our model calculates the slowdown imposed on communication and computation for non-dedicated two-machine heterogeneous platforms. We describe the model for the Sun/CM2 and Sun/Paragon coupled heterogeneous systems. We present experiments on production systems with emulated contention, which show that the predicted communication and computation costs are, on average, within 15% of the actual costs.
