Multi-Criteria Grid Resource Management Using Performance Prediction Techniques

To date, many existing Grid resource brokers select the best resources for computational jobs using basic resource parameters such as load. This approach is often insufficient: estimates of job start and execution times are needed to make better-informed decisions and to provide better quality of service to end users. However, because Grids are heterogeneous and the available information is often incomplete, the results of performance prediction methods may be very inaccurate. Estimates of prediction errors should therefore also be taken into account during the resource selection phase. In this paper we present a multi-criteria resource selection method based on estimates of job start and execution times and of their prediction errors. To this end, we use the GRMS [28] and GPRES tools. Tests were conducted on workload traces recorded from a parallel machine at UPC. These traces cover three years of job information as recorded by the LoadLeveler batch management system. We show that the presented method can considerably improve the efficiency of resource selection decisions.
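As an illustration only, the idea of weighing predicted times against their prediction errors can be sketched as a simple weighted-sum ranking. This is not the aggregation used by GRMS or GPRES, which the abstract does not specify. The names `Estimate`, `score`, and `select_resource`, the field layout, and the weights `w_time` and `w_error` are all hypothetical and chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical per-resource prediction record (all values in seconds)."""
    resource: str
    start_time: float   # predicted wait until the job starts
    start_error: float  # estimated error of the start-time prediction
    run_time: float     # predicted execution time
    run_error: float    # estimated error of the execution-time prediction


def score(e: Estimate, w_time: float = 0.7, w_error: float = 0.3) -> float:
    """Lower is better. Assumed weighted sum of two criteria:
    predicted completion time and total prediction uncertainty."""
    completion = e.start_time + e.run_time
    uncertainty = e.start_error + e.run_error
    return w_time * completion + w_error * uncertainty


def select_resource(estimates, w_time: float = 0.7, w_error: float = 0.3) -> Estimate:
    """Return the estimate with the lowest score."""
    if not estimates:
        raise ValueError("no candidate resources")
    return min(estimates, key=lambda e: score(e, w_time, w_error))


if __name__ == "__main__":
    candidates = [
        Estimate("A", start_time=100, start_error=10, run_time=500, run_error=20),
        Estimate("B", start_time=50, start_error=400, run_time=500, run_error=600),
    ]
    print(select_resource(candidates).resource)
```

With these made-up numbers, resource B has the earlier predicted completion but far less reliable predictions. Setting `w_error=0` reduces the ranking to predicted completion time alone, which shows how taking prediction error into account can change the selection.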
