Fast Algorithm for Sequence Edit Distance Computation

Dynamic programming plays an important role in biomedical signal processing, where it can be used to determine the similarity between two nucleotide sequences. In this paper, we propose an algorithm that improves the computational efficiency of dynamic programming for global edit distance computation. Efficiency is a critical issue because, in practice, a nucleotide sequence is typically 1,000 to 100,000 bases long, and the dynamic programming matrix has one entry for every pair of positions in the two sequences. We therefore propose several techniques, including the slope rule and the magic number rule, to simplify the dynamic programming procedure. Simulations show that, with these rules, only 10% of the entries in the dynamic programming matrix need to be computed, and the computation time is much less than that of the original algorithm.
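
For context, the baseline that such pruning rules aim to speed up can be sketched as the standard full-matrix dynamic programming recurrence for global edit distance. The sketch below is not the proposed algorithm: the slope rule and the magic number rule are not specified in this abstract, so they are not implemented here. The function name `edit_distance` and the unit costs for insertion, deletion, and substitution are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Baseline global edit distance (unit costs assumed, not from the paper).

    Fills every entry of the (len(a)+1) x (len(b)+1) matrix, keeping only
    two rows in memory at a time.
    """
    # Row 0: turning the empty prefix of `a` into b[:j] takes j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Column 0: turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

This baseline evaluates all entries of the matrix, so its running time grows with the product of the two sequence lengths. That quadratic cost is what makes skipping most of the entries attractive for sequences of the lengths mentioned above.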
