Performance‐based middleware for Grid computing

This paper describes a stateful service‐oriented middleware infrastructure for the management of scientific tasks running on multi‐domain heterogeneous distributed architectures. Allocating scientific workload across multiple administrative boundaries is a key issue in Grid computing and as a result a number of supporting services including match‐making, scheduling and staging have been developed. Each of these services allows the scientist to utilize the available resources, although a sustainable level of service in such shared environments cannot always be guaranteed. A performance‐based middleware infrastructure is described in which prediction data for each scientific task are calculated, stored and published through a Globus‐based performance information service. Distributing these data allows additional performance‐based middleware services to be built, two of which are described in this paper: an intra‐domain predictive co‐scheduler and a multi‐domain workload steering system. These additional facilities significantly improve the ability of the system to meet task deadlines, as well as enhancing inter‐domain load‐balancing and system‐wide resource utilization. Copyright © 2005 John Wiley & Sons, Ltd.
