Mining performance data for metascheduling decision support in the Grid

Metaschedulers in the Grid need dynamic information to support their scheduling decisions. One such performance metric is job response time on computing resources. In this paper, we propose an Instance Based Learning technique that predicts response times by mining historical performance data. The novelty of our approach is the introduction of policy attributes for representing and comparing resource states, where a resource state is defined as the pool of running and queued jobs on a resource at the time a prediction is made. The policy attributes reflect the local scheduling policies and can be discovered automatically using genetic search. We validate the technique in an extensive empirical evaluation using real workload traces collected from the NIKHEF production cluster on the LHC Computing Grid and from Blue Horizon at the San Diego Supercomputer Center (SDSC). The experimental results show that acceptable prediction accuracy can be achieved, with normalized average prediction errors for response times ranging from 0.57 to 0.79.
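To make the idea of instance-based prediction concrete, the following is a minimal sketch, not the authors' implementation. It assumes a plain k-nearest-neighbour predictor with a hand-weighted, range-normalized Euclidean distance. The attribute names (`nodes`, `queued_jobs`, `running_jobs`), the weights, the sample history, and the error definition (sum of absolute errors divided by the sum of actual response times) are all hypothetical choices made for illustration; the abstract does not specify them. Genetic search for policy attributes is not shown; the weights here are fixed by hand.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Instance:
    """One historical observation (hypothetical attribute names)."""
    features: dict        # job and resource-state attributes at submission
    response_time: float  # observed response time, in seconds


def distance(a, b, weights, ranges):
    """Weighted Euclidean distance over range-normalized attributes."""
    total = 0.0
    for name, weight in weights.items():
        span = ranges[name] or 1.0  # avoid dividing by zero for constant attributes
        diff = (a[name] - b[name]) / span
        total += weight * diff * diff
    return sqrt(total)


def predict(query, history, weights, k=3):
    """Predict response time as the mean over the k most similar past instances."""
    ranges = {
        name: max(h.features[name] for h in history)
        - min(h.features[name] for h in history)
        for name in weights
    }
    nearest = sorted(
        history, key=lambda h: distance(query, h.features, weights, ranges)
    )[:k]
    return sum(h.response_time for h in nearest) / len(nearest)


def normalized_error(predictions, actuals):
    """Assumed definition: total absolute error over total actual response time."""
    abs_err = sum(abs(p - a) for p, a in zip(predictions, actuals))
    return abs_err / sum(actuals)


# Hypothetical history; real inputs would come from workload traces.
history = [
    Instance({"nodes": 1, "queued_jobs": 0, "running_jobs": 2}, 100.0),
    Instance({"nodes": 1, "queued_jobs": 5, "running_jobs": 2}, 600.0),
    Instance({"nodes": 8, "queued_jobs": 5, "running_jobs": 4}, 900.0),
    Instance({"nodes": 8, "queued_jobs": 0, "running_jobs": 4}, 300.0),
]
# Hand-picked attribute weights (an assumption, not a discovered policy).
weights = {"nodes": 1.0, "queued_jobs": 1.0, "running_jobs": 0.5}
query = {"nodes": 1, "queued_jobs": 4, "running_jobs": 2}

if __name__ == "__main__":
    print(predict(query, history, weights, k=2))
```

In this sketch the weights decide which attributes dominate the similarity measure, which is the role a search procedure such as a genetic algorithm could play when tuning them against historical data.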
