Performance Modeling of a Cluster of Workstations

Using off-the-shelf commodity workstations to build a cluster for parallel computing has become common practice. Studying or designing such a cluster calls for a robust analytical model that includes the major parameters that determine cluster performance. In this paper, we present such a model for evaluating a cluster's performance. The model covers the effects of storage limitations, interconnection networks, and data partitioning. It can be used to estimate the throughput of the cluster or the expected service time of tasks under any specific configuration. It can also detect bottlenecks in the system, which can lead to more effective utilization of the available resources. The model we use, a multi-class Jackson network, can be considered a baseline for cluster architecture analysis because it models the system's behavior without assuming any special task or scheduling algorithms.
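To illustrate the kind of estimate such a model yields, the following is a minimal sketch of a single-class open Jackson network, which is a simplification of the multi-class model named above and not the paper's own formulation. The three stations (CPU, disk, network), the routing probabilities, and the service rates are hypothetical values chosen only for illustration. The sketch solves the traffic equations, computes per-station utilization, flags the most heavily utilized station as the bottleneck, and derives the mean time in system from Little's law.

```python
def solve_traffic(gamma, routing, tol=1e-12, max_iter=10_000):
    """Solve lam[j] = gamma[j] + sum_i lam[i] * routing[i][j] by fixed-point iteration."""
    n = len(gamma)
    lam = list(gamma)
    for _ in range(max_iter):
        new = [gamma[j] + sum(lam[i] * routing[i][j] for i in range(n))
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("traffic equations did not converge")


def analyze(gamma, routing, mu):
    """Return arrival rates, utilizations, bottleneck index, and mean time in system."""
    lam = solve_traffic(gamma, routing)
    rho = [l / m for l, m in zip(lam, mu)]
    if any(r >= 1.0 for r in rho):
        raise ValueError("unstable network: some station has utilization >= 1")
    # Each station behaves like an independent M/M/1 queue (Jackson's theorem).
    mean_jobs = [r / (1.0 - r) for r in rho]
    # Little's law applied to the whole network.
    mean_time = sum(mean_jobs) / sum(gamma)
    bottleneck = max(range(len(rho)), key=lambda i: rho[i])
    return {"lam": lam, "rho": rho, "bottleneck": bottleneck, "mean_time": mean_time}


# Hypothetical cluster node: station 0 = CPU, 1 = disk, 2 = network.
GAMMA = [1.0, 0.0, 0.0]            # external arrivals (tasks/s) enter at the CPU
ROUTING = [
    [0.0, 0.4, 0.3],               # CPU -> disk 0.4, network 0.3, leaves with 0.3
    [1.0, 0.0, 0.0],               # disk -> CPU
    [1.0, 0.0, 0.0],               # network -> CPU
]
MU = [5.0, 1.6, 4.0]               # service rates (tasks/s), assumed values

if __name__ == "__main__":
    result = analyze(GAMMA, ROUTING, MU)
    names = ["CPU", "disk", "network"]
    for name, r in zip(names, result["rho"]):
        print(f"{name}: utilization {r:.3f}")
    print("bottleneck:", names[result["bottleneck"]])
    print(f"mean time in system: {result['mean_time']:.3f} s")
```

Under these assumed rates the disk carries the highest utilization, so raising its service rate (or routing less work to it) is the change most likely to improve throughput. A multi-class version would keep separate arrival rates and routing matrices per task class and sum their loads at each station.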
