Database-Driven Grid Computing with GridBASE

The GridBASE framework for database-driven grid computing is presented, and its design and a prototype implementation are discussed. Industry-strength database technology plays a key role in the design. The database serves as a scalable, reliable, and remotely accessible component, both for storing and organizing the configuration information of the grid and for managing information about the grid users and the jobs and tasks they submit for execution. The other system components are worker nodes, a simple resource broker, a grid operator console, and application clients. By analogy with electrical power grids, our design clearly separates two roles: grid users, who develop and submit application code but are otherwise mostly isolated from resource deployment and selection, and the grid operator, who is responsible for providing computing resources and ensuring system availability and maintenance. Application code can be written in any language, and simple workflow support is provided. In our prototype implementation we experiment with delivering code and input and output files via the database component. Our approach is based on decentralization and is implemented in Java, leading to a lightweight, portable, and scalable grid computing solution that is especially suited to parallel bioinformatics. The deployment of GridBASE on Ontario's SHARCNET and its application to virtual experiments in RNA folding statistics are described.
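To make the architecture more concrete, the following Java sketch illustrates the general pattern of worker nodes pulling tasks from a shared task store and writing results back to it. It is not GridBASE code: the names GridSketch, TaskStore, claimNext, complete, and runWorkers are hypothetical, the task (squaring an integer) is a placeholder, and an in-memory map stands in for the database component. In a database-backed system the claim step would presumably be a transactional update on a task table rather than a synchronized method.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GridSketch {
    enum Status { PENDING, RUNNING, DONE }

    // One row of the hypothetical task table.
    static final class Task {
        final int id;
        final int input;
        Status status = Status.PENDING;
        Integer output; // null until the task is DONE

        Task(int id, int input) {
            this.id = id;
            this.input = input;
        }
    }

    // In-memory stand-in for the database component (an assumption of this sketch).
    static final class TaskStore {
        private final Map<Integer, Task> tasks = new LinkedHashMap<>();

        // An application client submits a task.
        synchronized void submit(int id, int input) {
            tasks.put(id, new Task(id, input));
        }

        // A worker atomically claims the next pending task; returns null if none is left.
        synchronized Task claimNext() {
            for (Task t : tasks.values()) {
                if (t.status == Status.PENDING) {
                    t.status = Status.RUNNING;
                    return t;
                }
            }
            return null;
        }

        // A worker stores the result of a finished task.
        synchronized void complete(int id, int output) {
            Task t = tasks.get(id);
            t.output = output;
            t.status = Status.DONE;
        }

        synchronized int countDone() {
            int n = 0;
            for (Task t : tasks.values()) {
                if (t.status == Status.DONE) n++;
            }
            return n;
        }

        synchronized Integer outputOf(int id) {
            return tasks.get(id).output;
        }
    }

    // Starts the given number of workers; each pulls tasks until the store is drained.
    static void runWorkers(TaskStore store, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                Task t;
                while ((t = store.claimNext()) != null) {
                    store.complete(t.id, t.input * t.input); // placeholder computation
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        TaskStore store = new TaskStore();
        for (int i = 1; i <= 5; i++) {
            store.submit(i, i);
        }
        runWorkers(store, 3);
        System.out.println("done=" + store.countDone() + " output(4)=" + store.outputOf(4));
        // prints "done=5 output(4)=16"
    }
}
```

Because workers pull work from the store instead of having it pushed to them, adding or removing a worker requires no change on the client side, which is one way to read the separation between grid users and the grid operator described above.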
