Resource Evaluation and Node Monitoring in Service Oriented Ad-hoc Grids

Ad-hoc grid computing is an emerging computing technology that promises to deliver high performance at relatively low cost using existing computing resources. Several grid middleware systems are being developed to this end. However, several features required for ad-hoc grid computing to become viable in a production environment are still lacking. This paper addresses two of these key features. The first is a resource evaluation and allocation system, which allows grid developers to specify the requirements of their grid job accurately so that the most suitable nodes are used when creating the ad-hoc grid. The second is a node monitoring and error recovery system, which allows grid applications to detect and recover from errors and complete successfully. These systems are built into Mage, the Marburg Ad-hoc Grid Environment, a grid middleware solution developed using the Globus Toolkit, Apache Tomcat and FreePastry.
