Model-Driven Simulation of Grid Scheduling Strategies

Simulation studies of grid scheduling strategies require representative workloads to produce dependable results. Real production grid workloads exhibit diverse correlation structures and scaling behavior that differ from the characteristics of the available supercomputer workloads and cannot be captured by Poisson or simple distribution-based models. We present models that reproduce various correlation structures, including pseudo-periodicity and long-range dependence. Using model-driven simulation, we quantitatively evaluate the performance impact of workload correlations on grid scheduling. The results indicate that autocorrelation in workloads degrades system performance at both the local and the grid level. Realistic workload modeling is therefore not only possible but also necessary for dependable grid scheduling studies.
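
The effect of arrival correlation on queueing delay can be illustrated with a toy example. The sketch below is not one of the workload models or simulators from this work; it assumes a single first-come-first-served server with a constant service time, and the names `lag1_autocorrelation`, `mean_wait`, `bursty`, and `alternating` are hypothetical. Both arrival sequences contain exactly the same interarrival gaps, so their marginal distributions are identical; only the ordering, and hence the autocorrelation, differs.

```python
from statistics import mean


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence (standard estimator)."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


def mean_wait(gaps, service_time):
    """Mean queueing delay on one FCFS server via the Lindley recursion.

    gaps[i] is the interarrival time between job i and job i + 1, so the
    sequence describes len(gaps) + 1 jobs. The first job never waits.
    """
    wait = 0.0
    waits = [wait]
    for gap in gaps:
        # The next job waits for whatever backlog remains when it arrives.
        wait = max(0.0, wait + service_time - gap)
        waits.append(wait)
    return mean(waits)


# Same multiset of gaps (mean 3), different ordering.
bursty = [1, 1, 1, 1, 5, 5, 5, 5]        # short gaps clustered together
alternating = [1, 5, 1, 5, 1, 5, 1, 5]   # short and long gaps interleaved

if __name__ == "__main__":
    for name, gaps in (("bursty", bursty), ("alternating", alternating)):
        print(name, lag1_autocorrelation(gaps), mean_wait(gaps, service_time=2))
```

With a service time of 2, the positively autocorrelated (bursty) ordering builds up a backlog during the run of short gaps and yields a larger mean wait than the interleaved ordering, even though the load is the same. This is only a deterministic illustration of why ordering matters; it says nothing about the magnitude of the effect in real grid workloads.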
