Massively parallel solutions for molecular sequence analysis

In this paper we present new approaches to high-performance protein database scanning on two novel massively parallel architectures, with the aim of obtaining supercomputer power at low cost. The first architecture is a Beowulf PC cluster linked by a high-speed network, with a fine-grained parallel Systola 1024 processor board connected to each node. The second architecture is the Fuzion 150, a new parallel computer with a linear SIMD array of 1536 processing elements on a single chip. We present the design of a database scanning application based on the Smith-Waterman algorithm and derive efficient mappings of it onto both architectures. The implementations lead to significant runtime savings for large-scale database scanning. These results show that both architectures provide high-throughput sequence similarity analysis at a good price/performance ratio.
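For orientation, the kernel that such a scan evaluates for every database entry can be sketched sequentially as follows. This is a minimal illustration of Smith-Waterman local alignment scoring, not the mapping used on either architecture. The function names `sw_score` and `scan_database`, the linear gap penalty, and the match/mismatch values are assumptions made for the example; a protein scan would normally use a substitution matrix, which the abstract does not specify.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    Scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def scan_database(query, database):
    """Score the query against every entry and rank hits, best first.

    `database` is a hypothetical mapping from sequence name to sequence.
    """
    hits = [(name, sw_score(query, seq)) for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    db = {"seq_a": "TTACGTTT", "seq_b": "GGGGGG", "seq_c": "ACGA"}
    for name, score in scan_database("ACGT", db):
        print(name, score)
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal are independent of each other. This is the property that makes the recurrence a natural fit for systolic and linear SIMD arrays.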
