Online Prediction of the Running Time of Tasks

We describe and evaluate the Running Time Advisor (RTA), a system that can predict the running time of a compute-bound task on a typical shared, unreserved commodity host. The prediction is computed from linear time series predictions of host load and takes the form of a confidence interval. The interval expresses the error associated with the measurement and prediction processes, error that must be captured to make statistically valid decisions based on the predictions. Adaptive applications make such decisions in pursuit of consistent high performance, choosing, for example, the host where a task is most likely to meet its deadline. We begin by describing the system and summarizing the results of our previously published work on host load prediction. We then describe our algorithm for computing predictions of running time from host load predictions. We next evaluate the system using over 100,000 randomized test cases run on 39 different hosts, finding that it is indeed capable of computing correct and useful confidence intervals. Finally, we report on our experience with using the RTA in application-oriented real-time scheduling in distributed systems.
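As a rough illustration of how an adaptive application might act on such confidence intervals, the sketch below picks a host for a task with a deadline. It is not the RTA interface or the algorithm described in the paper: the `RunningTimePrediction` type, the `choose_host` function, and the selection rule (prefer the host whose upper confidence bound is smallest, and only accept bounds within the deadline) are all hypothetical, chosen to show why an interval is more useful for this decision than a point estimate.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RunningTimePrediction:
    """Hypothetical record: a predicted running time with a confidence interval."""
    host: str
    expected: float  # point estimate of running time, in seconds
    lower: float     # lower bound of the confidence interval, in seconds
    upper: float     # upper bound of the confidence interval, in seconds


def choose_host(predictions: Iterable[RunningTimePrediction],
                deadline: float) -> Optional[RunningTimePrediction]:
    """Pick a host whose whole confidence interval fits within the deadline.

    Assumed rule (not taken from the paper): keep only hosts whose upper
    bound is at or below the deadline, then return the one with the
    smallest upper bound. Return None if no host qualifies.
    """
    feasible = [p for p in predictions if p.upper <= deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.upper)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    candidates = [
        RunningTimePrediction("host-a", expected=8.0, lower=6.0, upper=12.0),
        RunningTimePrediction("host-b", expected=9.0, lower=8.0, upper=10.0),
        RunningTimePrediction("host-c", expected=5.0, lower=3.0, upper=15.0),
    ]
    choice = choose_host(candidates, deadline=11.0)
    print(choice.host if choice else "no host meets the deadline")
```

In this made-up data, `host-c` has the best point estimate but the widest interval, so a rule based on the upper bound rejects it for an 11-second deadline. A selection based on the point estimate alone would have chosen it.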
