Stateful Grid Resource Selection for Related Asynchronous Tasks

In today’s grid deployments, resource selection is based on prior knowledge of the application’s performance characteristics on a particular resource and on real-time monitoring of that resource’s status, such as system load and network bandwidth. Any lag between a resource selection decision and the time the job appears in the system’s monitoring facility causes subsequent decisions to be based on incorrect information. If two or more jobs arrive within this hysteresis window, the incorrect assessment of system state can degrade job response time and system throughput. In this paper we describe a stateful resource selection protocol that we designed to mitigate this problem for a real-time storm surge modeling project. We present results from real experiments on a regional grid, and we use emulation to compare and study the effect of our protocol under varying load conditions. Based on our evaluation, we argue that the enhanced protocol should be made available as a globally aware grid resource selection service.
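
The lag problem described in the abstract can be illustrated with a minimal sketch. This is not the protocol from the paper, whose details are not given here; it only assumes that a selector remembers jobs it has dispatched that the monitor has not yet reported, and counts them toward each resource's load. All names (`Resource`, `StatefulSelector`, `stateless_select`, `pending`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    monitored_load: int   # jobs reported by the (lagging) monitoring facility
    pending: int = 0      # jobs dispatched but not yet visible to the monitor

    def effective_load(self) -> int:
        # Assumption: load is a simple job count; real metrics would differ.
        return self.monitored_load + self.pending


def stateless_select(resources):
    """Pick the least-loaded resource using monitored state only."""
    return min(resources, key=lambda r: (r.monitored_load, r.name)).name


class StatefulSelector:
    """Hypothetical selector that remembers its own recent decisions."""

    def __init__(self, resources):
        self.resources = {r.name: r for r in resources}

    def select(self) -> str:
        # Rank by monitored load plus jobs we already sent there.
        best = min(self.resources.values(),
                   key=lambda r: (r.effective_load(), r.name))
        best.pending += 1  # remember the decision until the monitor catches up
        return best.name

    def observe(self, name: str, monitored_load: int, newly_visible: int = 0):
        # Called when fresh monitoring data arrives; jobs that the monitor
        # now reports are removed from the pending count.
        r = self.resources[name]
        r.monitored_load = monitored_load
        r.pending = max(0, r.pending - newly_visible)
```

With three idle resources and three jobs arriving inside the lag window, `stateless_select` sends every job to the same resource, because the monitored load never changes between decisions, while `StatefulSelector.select` spreads them across all three.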
