A general self-adaptive task scheduling system for non-dedicated heterogeneous computing

Efforts to construct a national-scale grid computing environment have brought unprecedented computing capacity. Exploiting this complex infrastructure requires efficient middleware that supports the execution of a distributed application, composed of a set of subtasks, with the best possible performance. This raises the challenge of how to schedule these subtasks in shared heterogeneous systems. Current work has several limitations. Most scheduling systems rely on deterministic estimates of task completion time. Current application-level scheduling algorithms are too closely coupled to the internal structure of the application. Application performance may suffer when some resources exhibit an abnormal usage pattern during application execution. To address these issues, we develop a prototype of the Grid Harvest Service (GHS) to provide dynamic and self-adaptive task scheduling. Experimental results show that GHS outperforms current systems in scheduling large applications in a non-dedicated heterogeneous environment.
