Local learning has been proposed as a common framework to predict both application run times and queue wait times based on workload traces [8]. The queue wait time is shown to be more difficult and expensive to predict because its distance calculations typically involve not only job attributes but also resource states. In this paper, methods and algorithms are investigated to improve both the accuracy and the computational performance of queue wait time prediction. First, so-called "local tuning" is adopted to tune parameters separately for each training subset, where the training set is divided by a pivot attribute (e.g., group or queue name). A bias-variance analysis of the error is conducted on local tuning and on its global counterpart, in which parameters are tuned on the whole training set. A method is then developed to select the tuning type adaptively based on the generalization error and the bias-variance decomposition. Second, an efficient search tree structure called "M-Tree" is integrated into our algorithm to speed up k-nearest neighbor search. Experimental studies are conducted to evaluate the proposed methods and algorithms using real-world workload traces collected from the NIKHEF production cluster on the LHC Computing Grid and from Blue Horizon at the San Diego Supercomputer Center (SDSC). The results show that adaptive tuning can reduce the average prediction error by 3 to 10 percent compared to global tuning, and that M-Tree nearest neighbor search is up to 8 times faster than sequential search.
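The choice between local and global tuning described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the only tuned parameter is the neighbor count k, uses plain Euclidean distance over numeric features, draws neighbors from within each pivot subset, and selects the tuning type from a held-out error estimate alone, without the bias-variance decomposition. All names (`knn_predict`, `loo_errors`, `adaptive_tuning`, `tune_fraction`) are hypothetical.

```python
from collections import defaultdict
from math import dist  # requires Python 3.8+


def knn_predict(train, query, k):
    """Average the targets of the k rows in `train` closest to `query`."""
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_errors(rows, k):
    """Leave-one-out absolute errors of k-NN on `rows` (empty if < 2 rows)."""
    if len(rows) < 2:
        return []
    return [abs(knn_predict(rows[:i] + rows[i + 1:], x, k) - y)
            for i, (x, y) in enumerate(rows)]


def mean(values):
    return sum(values) / len(values)


def adaptive_tuning(jobs, candidates=(1, 3, 5), tune_fraction=0.7):
    """Pick, per pivot value, the globally or the locally tuned k.

    `jobs` is a list of (pivot, features, wait_time) tuples in trace order.
    Assumption: k is the only parameter being tuned.
    """
    subsets = defaultdict(list)
    for pivot, x, y in jobs:
        subsets[pivot].append((x, y))

    # Earlier rows of each subset tune k; later rows estimate the
    # generalization error of the tuned value.
    tune, held = {}, {}
    for pivot, rows in subsets.items():
        cut = max(1, int(len(rows) * tune_fraction))
        tune[pivot], held[pivot] = rows[:cut], rows[cut:]

    # Global tuning: a single k minimizing the pooled leave-one-out error.
    def pooled_error(k):
        errors = [e for rows in tune.values() for e in loo_errors(rows, k)]
        return mean(errors) if errors else float("inf")

    global_k = min(candidates, key=pooled_error)

    chosen = {}
    for pivot, rows in tune.items():
        if len(rows) < 2 or not held[pivot]:
            # Too little data to tune or validate locally.
            chosen[pivot] = ("global", global_k)
            continue

        # Local tuning: the k minimizing leave-one-out error on this subset.
        local_k = min(candidates, key=lambda k: mean(loo_errors(rows, k)))

        def held_error(k):
            return mean([abs(knn_predict(rows, x, k) - y)
                         for x, y in held[pivot]])

        if held_error(local_k) < held_error(global_k):
            chosen[pivot] = ("local", local_k)
        else:
            chosen[pivot] = ("global", global_k)
    return chosen
```

Calling `adaptive_tuning(jobs)` returns a dictionary mapping each pivot value to a `("local", k)` or `("global", k)` pair. Subsets that are too small to tune on their own fall back to the global value, which reflects the intuition that local tuning lowers bias but raises variance when a subset has few jobs.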
[1] Pavel Zezula, et al. M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. VLDB, 1997.
[2] Ricardo A. Baeza-Yates, et al. Searching in metric spaces. CSUR, 2001.
[3] Hui Li, et al. Predicting job start times on clusters. IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004.
[4] Hui Li, et al. Efficient response time predictions by exploiting application and resource state similarities. The 6th IEEE/ACM International Workshop on Grid Computing, 2005.
[5] Carla E. Brodley, et al. Predictive application-performance modeling in a computational grid environment. Proceedings of the Eighth International Symposium on High Performance Distributed Computing (Cat. No.99TH8469), 1999.
[6] Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma. Neural Computation, 1992.
[7] Andrew W. Moore, et al. Locally Weighted Learning. Artificial Intelligence Review, 1997.
[8] Hui Li, et al. Mining performance data for metascheduling decision support in the Grid. Future Gener. Comput. Syst., 2007.