Multidimensional dynamic programming for homology search

Alignment problems in computational biology have attracted growing attention because of the rapid growth of sequence databases. Computing an alignment reveals how similar the sequences are. Many alignment systems have been proposed to date, but most of them are designed for two-dimensional alignment (alignment between two sequences). In this paper, we describe a compact system, consisting of an off-the-shelf FPGA board and a host computer, for alignment in more than three dimensions based on dynamic programming. Our approach achieves high performance (1) by configuring an optimal circuit for the alignment of each dimensionality, and (2) by performing a two-phase search in each dimension through reconfiguration. To realize multidimensional search with a common architecture, two-dimensional dynamic programming is repeated along the other dimensions. This approach minimizes the size of the alignment units and achieves high parallelism. With a single XC2V6000, our system achieves a speedup of about 300-fold over a single 2 GHz Intel Pentium 4 processor for four-dimensional alignment, and about 100-fold for five-dimensional alignment.
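The idea of repeating two-dimensional dynamic programming along another dimension can be illustrated in software. The following is a minimal sketch, not the circuit described in this paper: it computes a global alignment score for three sequences by filling one two-dimensional plane per position of the third sequence and keeping only the previous plane. The sum-of-pairs scoring, the score values (`MATCH`, `MISMATCH`, `GAP`), and the function names are assumptions chosen for illustration only.

```python
from itertools import product

# Assumed, illustrative scores (not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(x, y):
    """Score one pair of symbols in an alignment column ('-' is a gap)."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return GAP
    return MATCH if x == y else MISMATCH


def column_score(x, y, z):
    """Sum-of-pairs score of one three-way alignment column."""
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)


# The seven ways a column can consume a symbol from a, b and/or c.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]


def align3_score(a, b, c):
    """Global three-sequence alignment score, computed plane by plane."""
    neg_inf = float("-inf")
    prev = None  # two-dimensional plane for position k - 1 of c
    for k in range(len(c) + 1):
        # Fill one two-dimensional plane (over a and b) for this k.
        cur = [[neg_inf] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            for j in range(len(b) + 1):
                if i == 0 and j == 0 and k == 0:
                    cur[i][j] = 0
                    continue
                best = neg_inf
                for di, dj, dk in MOVES:
                    if di > i or dj > j or dk > k:
                        continue  # move would leave the table
                    src = prev if dk else cur  # previous or current plane
                    x = a[i - 1] if di else "-"
                    y = b[j - 1] if dj else "-"
                    z = c[k - 1] if dk else "-"
                    cand = src[i - di][j - dj] + column_score(x, y, z)
                    best = max(best, cand)
                cur[i][j] = best
        prev = cur
    return prev[len(a)][len(b)]


if __name__ == "__main__":
    print(align3_score("ACGT", "ACGT", "ACGT"))
```

Because each plane depends only on itself and the plane before it, memory stays proportional to the size of one two-dimensional table. The same layering could in principle be nested again for a fourth or fifth sequence, at the cost of more moves per cell.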
