Paper information - ANDY: a general, fault-tolerant tool for database searching on computer clusters

SUMMARY: ANDY (seArch coordination aND analYsis) is a set of Perl programs and modules for distributing large biological database searches, and in general any sequence of commands, across the nodes of a Linux computer cluster. ANDY is compatible with several commonly used distributed resource management (DRM) systems, and it can be easily extended to new DRMs. A distinctive feature of ANDY is the choice of either dedicated or fair-use operation: ANDY is almost as efficient as single-purpose tools that require a dedicated cluster, but it runs on a general-purpose cluster along with any other jobs scheduled by a DRM. Other features include communication through named pipes for performance, flexible customizable routines for error-checking and summarizing results, and multiple fault-tolerance mechanisms.

AVAILABILITY: ANDY is freely available and can be obtained from http://compbio.berkeley.edu/proj/andy.

SUPPLEMENTARY INFORMATION: Supplemental data, figures, and a more detailed overview of the software are found at http://compbio.berkeley.edu/proj/andy.

Andrew Smith | Steven E. Brenner | John-Marc Chandonia
