Sequence Analysis With the Kestrel SIMD Parallel Processor

Computer-aided sequence analysis is a critical aspect of current biological research. Sequence information from the genome sequencing projects fills databases so quickly that humans cannot examine it all, so researchers rely heavily on computer algorithms to point out the few important nuggets for human examination. Sequence search algorithms range from simple to complex, as do the representations of the biological data. Typically, though, simple algorithms are applied to the simplest data representations because anything more complex carries large computational demands. This leads to missed hits because the simple search techniques are often not sufficiently sensitive. Here we describe the implementation of several sensitive sequence analysis algorithms on the Kestrel parallel processor, a single-instruction multiple-data (SIMD) processor developed and built at UCSC. The Smith-Waterman and hidden Markov model algorithms (with both Viterbi and expectation-maximization methods) run 6 to 20 times faster on Kestrel than on standard computers.
