Design and implementation of a computational grid for bioinformatics

Internet computing and grid technologies promise to change the way we tackle complex problems. They enable large-scale aggregation and sharing of computational, data, and other resources across institutional boundaries, and harnessing them effectively will transform scientific disciplines ranging from high-energy physics to the life sciences. The computational analysis of biological sequences is a computation-driven science. Because biological data are growing quickly and the databases that hold them are heterogeneous, a grid system can be used to share and integrate these databases. Bioinformatics tools can speed up the analysis of large-scale sequence data, especially sequence alignment. FASTA is a tool for aligning multiple protein or nucleotide sequences. The two bioinformatics programs we used are distributed and parallel versions. They use a message-passing library, MPI (Message Passing Interface), and run on distributed workstation clusters as well as on traditional parallel computers. A grid computing environment is proposed and constructed on multiple Linux PC clusters using the Globus Toolkit (GT) and Sun Grid Engine (SGE). Experimental results and the performance of the bioinformatics tools on the grid system are also presented.
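As an illustration only, the scatter/gather pattern that message-passing sequence tools typically follow can be sketched in a few lines of Python. This is not the software described above: threads stand in for MPI ranks, `toy_score` is a hypothetical shared k-mer count rather than the FASTA algorithm, and all function names and sequences are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k=3):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_score(query, subject, k=3):
    """Toy similarity: number of shared k-mers (NOT the FASTA algorithm)."""
    return len(kmers(query, k) & kmers(subject, k))


def partition(items, n_workers):
    """Split items into n_workers round-robin chunks, as a master rank might."""
    return [items[i::n_workers] for i in range(n_workers)]


def search_chunk(chunk, database):
    """Work done by one worker: best-scoring database entry for each query."""
    results = {}
    for name, seq in chunk:
        best = max(database, key=lambda entry: toy_score(seq, entry[1]))
        results[name] = best[0]
    return results


def distributed_search(queries, database, n_workers=2):
    """Scatter query chunks to workers, then gather and merge their results."""
    chunks = partition(queries, n_workers)
    merged = {}
    # Threads are a stand-in for MPI ranks on separate cluster nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(lambda c: search_chunk(c, database), chunks):
            merged.update(partial)
    return merged


# Made-up sequences, used only to exercise the sketch.
database = [("dbA", "ACGTACGTAC"), ("dbB", "TTTTGGGGCC")]
queries = [("q1", "ACGTAC"), ("q2", "TTGGGG"), ("q3", "GTACGT")]
hits = distributed_search(queries, database)
```

Here `hits` maps each query name to the name of its best-matching database entry. In a real MPI program the partition step would correspond to a scatter from the master process and the merge step to a gather, with each chunk processed on a different node.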
