Adaps - A three-phase adaptive prediction system for the run-time of jobs based on user behaviour

In heterogeneous and distributed environments, schedules must be created so that resources are utilised efficiently. Generating such schedules is often difficult for a scheduler, since several aspects have to be considered. One way of supporting a scheduler is to provide accurate predictions of the run-times of submitted jobs. Many current techniques apply statistical models to previously filtered data. Because users submit different jobs, and because the attributes of those jobs differ, both the filtering of data and the choice of a prediction method have to account for these differences. This article describes Adaps, a system for run-time prediction that works in three phases. Each phase adjusts independently to the jobs of a user, based on historical information. This leads to a user-specific clustering of data and to a flexible use of different prediction techniques in order to create a user-centred prediction model.
