Scheduling in Heterogeneous Grid Environments: The Effects of Data Migration

Computational grids have the potential to solve large-scale scientific problems using heterogeneous and geographically distributed resources. However, a number of major technical hurdles must be overcome before this goal can be fully realized. One problem critical to the effective utilization of computational grids is efficient job scheduling. Our prior work addressed this challenge by defining a grid scheduling architecture and several job migration strategies. This study explores the impact of data migration under a variety of demanding grid conditions. We evaluate our grid scheduling algorithms by simulating compute servers, various groupings of servers into sites, and inter-server networks, using real workloads obtained from leading supercomputing centers. Several key performance metrics are used to compare the behavior of our algorithms against reference local and centralized scheduling schemes. Results show that grid scheduling yields substantial benefits even when input/output data must be migrated, and they highlight the importance of communication-aware scheduling schemes.