Recent years have seen unprecedented growth in the computing power available to tackle important problems in science, engineering and medicine. For example, the SHARCNET network links large computing resources at 11 leading academic institutions in South Central Ontario, providing access to thousands of compute processors. Developing efficient and scalable algorithms and methods for solving large scientific and engineering problems on such parallel and distributed computers is a continuing challenge. If the computing power of such computational grids can be harnessed effectively and scalably, large scientific problems can be solved that would otherwise be hard to solve on the same machines used in a stand-alone way. This paper describes techniques and software that make it possible to apply the power of computational grids to large-scale, loosely coupled parallel bioinformatics problems. Our approach is based on decentralization and is implemented in Java, leading to a flexible, portable and scalable software solution for parallel bioinformatics. We discuss the advantages and disadvantages of this approach, and demonstrate seamless performance on an ad hoc grid composed of a wide variety of hardware for a real-life parallel bioinformatics problem. The problem consists of virtual experiments in RNA folding, executed concurrently on hundreds of compute processors, which may establish one of the missing links in the chain of events that led to the origin of life.