Paper information - Evaluation of Meta-scheduler Architectures and Task Assignment Policies for High Throughput Computing

In this paper we present a model and a simulator for many clusters of heterogeneous PCs, each cluster belonging to a local network. The clusters are assumed to be connected to each other through a global network, and each cluster is managed by a local scheduler that is shared by many users. We validate the simulator by comparing its experimental results with the analytical results for an M/M/4 queuing system; this comparison indicates that the simulator is consistent. We then compare the simulator with a real batch system and obtain an average error of 10.5% for the response time and 12% for the makespan. We conclude that the simulator is realistic and describes the behaviour of a large-scale system well. This allows us to study the scheduling of our system in a high throughput context. We justify our decentralized, adaptive and opportunistic approach by comparing it with a centralized approach in such a context.
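The M/M/4 consistency check mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' simulator. It only compares the standard Erlang C result for the mean response time of an M/M/c queue with a small first-come-first-served simulation. The arrival rate, service rate, job count, and function names below are all assumptions chosen for illustration.

```python
import heapq
import math
import random


def mmc_mean_response_time(lam, mu, c):
    """Analytical mean response time (wait + service) of an M/M/c queue."""
    a = lam / mu              # offered load
    rho = a / c               # per-server utilisation
    if rho >= 1.0:
        raise ValueError("queue is unstable: need lam < c * mu")
    head = sum(a ** k / math.factorial(k) for k in range(c))
    tail = a ** c / (math.factorial(c) * (1.0 - rho))
    p_wait = tail / (head + tail)        # Erlang C: probability a job must wait
    mean_wait = p_wait / (c * mu - lam)  # mean time spent in the queue
    return mean_wait + 1.0 / mu


def simulate_mmc(lam, mu, c, n_jobs, seed=0):
    """Mean response time of a simulated FCFS M/M/c queue (illustrative only)."""
    rng = random.Random(seed)
    free_at = [0.0] * c       # min-heap of times at which each server is free
    heapq.heapify(free_at)
    now = 0.0
    total = 0.0
    for _ in range(n_jobs):
        now += rng.expovariate(lam)          # next Poisson arrival
        start = max(now, heapq.heappop(free_at))
        finish = start + rng.expovariate(mu)  # exponential service time
        heapq.heappush(free_at, finish)
        total += finish - now
    return total / n_jobs


if __name__ == "__main__":
    # Assumed parameters: 4 servers, 3 jobs/s arriving, 1 job/s per server.
    analytic = mmc_mean_response_time(3.0, 1.0, 4)
    simulated = simulate_mmc(3.0, 1.0, 4, n_jobs=100_000)
    error = abs(simulated - analytic) / analytic
    print(f"analytic={analytic:.4f} simulated={simulated:.4f} error={error:.2%}")
```

The relative error computed in the last step is the same kind of measure as the average errors the abstract reports for response time and makespan against the real batch system, although those figures come from the authors' own experiments and not from this sketch.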
