Nimrod/G: an architecture for a resource management and scheduling system in a global computational grid

The availability of powerful microprocessors and high-speed networks as commodity components has enabled high-performance computing on distributed systems (wide-area cluster computing). In this environment, resources are usually distributed geographically at various levels (department, enterprise or worldwide), so integrating, coordinating and presenting them to the user as a single resource, thus forming a computational grid, is a major challenge. Another challenge comes from the distributed ownership of resources, with each resource having its own access policy, cost and mechanism. The proposed Nimrod/G grid-enabled resource management and scheduling system builds on our earlier work on Nimrod (D. Abramson et al., 1994, 1995, 1997, 2000) and follows a modular, component-based architecture that enables extensibility, portability, ease of development, and interoperability of independently developed components. It uses the Globus toolkit services and can be easily extended to operate with any other emerging grid middleware services. It focuses on the management and scheduling of computations over dynamic resources scattered geographically across the Internet at department, enterprise or global levels, with particular emphasis on developing scheduling schemes based on the concept of computational economy for a real testbed, namely the Globus testbed (GUSTO).
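To make the idea of scheduling under a computational economy more concrete, the following is a minimal sketch and not the Nimrod/G scheduler itself. It assumes a hypothetical model in which each resource advertises a fixed price and a fixed run time per job, and the user supplies a deadline and a budget; the names `Resource`, `plan`, `cost_per_job` and `seconds_per_job` are illustrative assumptions, not part of Nimrod/G or Globus.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical resource description (not a Nimrod/G or Globus type)."""
    name: str
    cost_per_job: float     # assumed price charged by the owner per job
    seconds_per_job: float  # assumed run time of one job on this resource


def plan(resources, n_jobs, deadline_s, budget):
    """Assign jobs to the cheapest resources that can finish by the deadline.

    Returns (allocation, total_cost), or None when the jobs cannot all be
    completed within both the deadline and the budget.
    """
    allocation = {}
    remaining = n_jobs
    total_cost = 0.0
    # Cheapest resources are filled first; each runs its jobs one at a time.
    for r in sorted(resources, key=lambda r: r.cost_per_job):
        if remaining == 0:
            break
        capacity = int(deadline_s // r.seconds_per_job)  # jobs done by deadline
        take = min(capacity, remaining)
        if take > 0:
            allocation[r.name] = take
            total_cost += take * r.cost_per_job
            remaining -= take
    if remaining > 0 or total_cost > budget:
        return None
    return allocation, total_cost


if __name__ == "__main__":
    pool = [
        Resource("department", 1.0, 60.0),
        Resource("enterprise", 2.0, 30.0),
        Resource("global", 5.0, 10.0),
    ]
    print(plan(pool, n_jobs=10, deadline_s=300, budget=100))
```

Under this simplified model, filling the cheapest resources first yields the lowest total cost for a given deadline, so a plan that exceeds the budget means no cheaper feasible plan exists. A real grid scheduler would also have to cope with prices and availability that change over time, which this sketch ignores.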

[1] R. Sosič et al. The Nimrod computational workbench: a case study in desktop metacomputing, 1996.

[2] Henri Casanova et al. NetSolve: a Network-Enabled Server for Solving Computational Science Problems, 1997, Int. J. High Perform. Comput. Appl.

[3] Rajkumar Buyya et al. High Performance Cluster Computing, 1999.

[4] Ian T. Foster et al. Globus: a Metacomputing Infrastructure Toolkit, 1997, Int. J. High Perform. Comput. Appl.

[5] David Abramson et al. High performance parametric modeling with Nimrod/G: killer application for the global grid?, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.

[6] Henri Casanova et al. Adaptive Scheduling for Task Farming with Grid Middleware, 1999, Int. J. High Perform. Comput. Appl.

[7] Henri Casanova et al. NetSolve: A Network Server for Solving Computational Science Problems, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.

[8] Rok Sosič et al. The Laboratory Bench: Distributed Computing for Parametised Simulations, 1994.

[9] Richard Wolski et al. The network weather service: a distributed resource performance forecasting service for metacomputing, 1999, Future Gener. Comput. Syst.

[10] Francine Berman et al. The AppLeS Project: A Status Report, 1997.

[11] Craig J. Patten et al. DISCWorld: an environment for service-based metacomputing, 1999, Future Gener. Comput. Syst.

[12] R. Sosič et al. Tool-based Parameterisation: An Application Perspective, 1995.

[13] Henri Casanova et al. Adaptive Scheduling for Task Farming with Grid Middleware, 1999, Euro-Par.

[14] David Abramson et al. Nimrod: a tool for performing parametrised simulations using distributed workstations, 1995, Proceedings of the Fourth IEEE International Symposium on High Performance Distributed Computing.

[15] Vipin Kumar et al. Information power grid: The new frontier in parallel computing?, 1999, IEEE Concurr.

[16] Rajkumar Buyya et al. High Performance Cluster Computing: Architectures and Systems, 1999.