Data Aggregation based Adaptive Long term load Prediction mechanism in Grid environment

In recent years, Grid computing has become increasingly attractive as a technique for supporting computer-supported cooperative work (CSCW). Because CPU load information can greatly guide task scheduling, long-term CPU load prediction has been widely studied. However, prediction errors accumulate over successive steps, and the optimal values of the relevant parameters may change dynamically as the load series varies, so previous prediction algorithms usually cannot achieve good accuracy when the prediction interval is long. To address these problems, this paper proposes a Data Aggregation based Adaptive Long term load Prediction mechanism called DA2LP. To reduce the number of prediction steps and increase the amount of useful input load information, data aggregation is integrated with the autoregressive (AR) model. In addition, based on observation and analysis of how the relevant parameters affect prediction accuracy in our model, an adaptive parameter selection mechanism is proposed that adjusts these parameters automatically during prediction to improve accuracy. Experiments show that the proposed mechanism significantly outperforms previous prediction methods in mean square error (MSE) for long-term load prediction.
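The idea of combining data aggregation with an AR model can be illustrated with a minimal sketch. This is not the DA2LP algorithm itself, whose details the abstract does not give. It assumes that aggregation means averaging non-overlapping windows, that the AR coefficients are fitted by Yule-Walker (Levinson-Durbin), and that parameter adaptation can be approximated by a grid search on a held-out tail of the series. All function names (`aggregate`, `fit_ar`, `aggregated_forecast`, `select_params`) are hypothetical.

```python
from statistics import fmean

def aggregate(series, window):
    """Average non-overlapping windows, aligned to the end of the series."""
    n = len(series) // window
    start = len(series) - n * window  # drop leftover samples at the front
    return [fmean(series[start + i * window:start + (i + 1) * window])
            for i in range(n)]

def fit_ar(series, order):
    """Fit AR(order) by Yule-Walker / Levinson-Durbin; returns (mean, coeffs)."""
    mu = fmean(series)
    x = [v - mu for v in series]
    n = len(x)
    # Biased autocovariance estimates r[0..order]
    r = [sum(x[t] * x[t - k] for t in range(k, n)) / n for k in range(order + 1)]
    phi, err = [], r[0]
    for k in range(1, order + 1):
        if err <= 1e-12:  # (near-)constant series: stop, pad with zeros below
            break
        kappa = (r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))) / err
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
    return mu, phi + [0.0] * (order - len(phi))

def aggregated_forecast(load, horizon, window, order):
    """Forecast window means covering `horizon` samples in ceil(horizon/window) steps."""
    agg = aggregate(load, window)
    if len(agg) <= order:
        raise ValueError("not enough aggregated samples for this order")
    mu, phi = fit_ar(agg, order)
    steps = -(-horizon // window)  # ceiling division
    hist = list(agg)
    for _ in range(steps):
        # Each prediction is fed back as input for the next step
        hist.append(mu + sum(c * (hist[-1 - j] - mu) for j, c in enumerate(phi)))
    return hist[-steps:]

def select_params(load, horizon, windows, orders):
    """Assumed stand-in for adaptive selection: pick (window, order) with the
    lowest per-sample MSE on the last `horizon` samples, held out from fitting."""
    train, held = load[:-horizon], load[-horizon:]
    best, best_mse = None, float("inf")
    for w in windows:
        for p in orders:
            if len(train) // w <= p + 1:
                continue  # too little data for this combination
            pred = aggregated_forecast(train, horizon, w, p)
            mse = fmean((held[t] - pred[t // w]) ** 2 for t in range(horizon))
            if mse < best_mse:
                best, best_mse = (w, p), mse
    return best
```

With a window of 4 and a horizon of 12 samples, the forecast needs only 3 recursive AR steps instead of 12, which is the sense in which aggregation shortens the chain along which errors can accumulate. Re-running `select_params` as new load samples arrive would be one simple way to let the window and order track changes in the series.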
