BLAST is probably the most widely used application among bioinformatics teams. The computational complexity of BLAST becomes a concern when the query sequence sets and reference databases are large. Here we present BGBlast, an approach for handling the computational complexity of large BLAST executions by porting BLAST to the Grid platform, leveraging the power of the thousands of CPUs that compose the EGEE infrastructure. BGBlast provides innovative features for efficiently managing BLAST databases in the distributed Grid environment. The system (1) keeps the databases constantly up to date while still allowing the user to revert to earlier versions, (2) stores the older versions of databases on the Grid with a time- and space-efficient delta encoding, and (3) manages the number of replicas of each database over the Grid with an adaptive algorithm that dynamically balances execution parallelism against storage costs.
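
The idea of keeping only the latest database in full and storing older versions as deltas can be illustrated with a minimal sketch. It is not BGBlast's actual encoding, which the abstract does not specify. It assumes a database can be modelled as an in-memory mapping from sequence ID to sequence, and the names `reverse_delta` and `regress` are hypothetical, introduced only for this illustration.

```python
from typing import Dict, Optional

# Assumption: a database version is modelled as {sequence ID: sequence}.
Database = Dict[str, str]
# A reverse delta maps a sequence ID to its value in the OLD version,
# or to None when the entry did not exist in the old version.
Delta = Dict[str, Optional[str]]


def reverse_delta(old: Database, new: Database) -> Delta:
    """Record what must change in `new` to get back to `old`."""
    delta: Delta = {}
    for seq_id, seq in old.items():
        if new.get(seq_id) != seq:
            # Entry was modified or removed in the new version.
            delta[seq_id] = seq
    for seq_id in new:
        if seq_id not in old:
            # Entry was added in the new version.
            delta[seq_id] = None
    return delta


def regress(new: Database, delta: Delta) -> Database:
    """Rebuild the older version from the newer one plus the reverse delta."""
    restored = dict(new)
    for seq_id, seq in delta.items():
        if seq is None:
            restored.pop(seq_id, None)
        else:
            restored[seq_id] = seq
    return restored


old_db = {"seq1": "ACGT", "seq2": "GGCC"}
new_db = {"seq1": "ACGT", "seq2": "GGCA", "seq3": "TTTT"}
delta = reverse_delta(old_db, new_db)
restored_db = regress(new_db, delta)
```

In this sketch only the entries that differ between versions are stored, so the delta stays small when consecutive database releases share most of their sequences.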