Paper information - Job co-allocation strategies for multiple high performance computing clusters

To use a network of high performance computing clusters more effectively, allocating multi-process jobs across multiple connected clusters is an attractive possibility. This allocation process, which we refer to as co-allocation, divides the processes of a job among several clusters. Co-allocation offers more efficient use of computer resources, reduced turn-around time, and computations that use more processes than any single cluster provides. Realizing these benefits depends ultimately on the inter-cluster communication cost. In this paper, we introduce a scalable co-allocation strategy called the Maximum Bandwidth Adjacent cluster Set (MBAS) strategy. The strategy uses two thresholds to control allocation: one sets the limit on bandwidth for usable inter-cluster communication links, and the other controls how jobs are split. We developed a simulator that models the dynamic behavior of jobs running across multiple clusters and used it to examine the performance of the MBAS co-allocation strategy. Our results indicate that, by adjusting the thresholds for link-level control and chunk-size control in splitting jobs, the MBAS co-allocation strategy can significantly improve both user satisfaction and system utilization.
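
The two-threshold idea described above can be illustrated with a minimal sketch. This is not the MBAS algorithm from the paper, whose details the abstract does not give. It assumes that the link threshold acts as a minimum-bandwidth cutoff (`min_bandwidth`), that the splitting threshold is a minimum chunk size (`min_chunk`), and that a job is placed on a seed cluster plus its directly linked neighbours, highest bandwidth first. All names, parameters, and data layouts here are hypothetical.

```python
from typing import Dict, Optional, Tuple

Links = Dict[Tuple[str, str], float]


def link_bw(bandwidth: Links, a: str, b: str) -> float:
    """Bandwidth of the undirected link a-b, or 0.0 if no link is listed."""
    return bandwidth.get((a, b), bandwidth.get((b, a), 0.0))


def co_allocate(
    job_size: int,
    free: Dict[str, int],
    bandwidth: Links,
    min_bandwidth: float,
    min_chunk: int,
) -> Optional[Dict[str, int]]:
    """Return a {cluster: processes} plan, or None if no plan is found.

    Illustrative assumptions (not taken from the paper):
    - a job that fits on one cluster is never split;
    - only links with bandwidth >= min_bandwidth are usable;
    - every chunk placed on a cluster holds at least min_chunk processes.
    """
    # Prefer a single cluster (best fit), which avoids inter-cluster traffic.
    fits = [c for c in free if free[c] >= job_size]
    if fits:
        best = min(fits, key=lambda c: free[c])
        return {best: job_size}

    # Otherwise try each cluster as a seed, largest free capacity first.
    for seed in sorted(free, key=lambda c: free[c], reverse=True):
        if free[seed] < min_chunk:
            continue
        plan = {seed: free[seed]}
        remaining = job_size - free[seed]

        # Neighbours reachable over usable links, highest bandwidth first.
        neighbours = [
            c for c in free
            if c != seed and link_bw(bandwidth, seed, c) >= min_bandwidth
        ]
        neighbours.sort(key=lambda c: link_bw(bandwidth, seed, c), reverse=True)

        for c in neighbours:
            if remaining == 0:
                break
            chunk = min(free[c], remaining)
            if chunk < min_chunk:
                continue  # chunk-size threshold: skip chunks that are too small
            plan[c] = chunk
            remaining -= chunk

        if remaining == 0:
            return plan
    return None


if __name__ == "__main__":
    free = {"A": 8, "B": 6, "C": 10}
    bandwidth = {("A", "C"): 10.0, ("B", "C"): 1.0, ("A", "B"): 5.0}
    print(co_allocate(16, free, bandwidth, min_bandwidth=2.0, min_chunk=4))
```

In this sketch, raising `min_bandwidth` shrinks the set of clusters a job may span, and raising `min_chunk` rules out fine-grained splits. Both changes trade placement flexibility for lower communication cost, which is the kind of tuning the abstract attributes to the two thresholds.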

Jinhui Qin | Michael Anthony Bauer
