Aligning a DNA Sequence with a Protein Sequence

We develop several algorithms for the problem of aligning DNA sequence with a protein sequence. Our methods account for frameshift errors, but not for introns in the DNA sequence. Thus, they are particularly appropriate for comparing a cDNA sequence that suffers from sequencing errors with an amino acid sequence or a protein sequence database. We describe algorithms for computing optimal alignments for several definitions of DNA-protein alignment, verify sufficient conditions for equivalence of certain definitions, describe techniques for efficient implementation, and discuss experience with these ideas in a new release of the FASTA suite of database-searching programs.

[1]  J Hein,et al.  An algorithm combining DNA and protein alignment. , 1994, Journal of theoretical biology.

[2]  O. Gotoh An improved algorithm for matching biological sequences. , 1982, Journal of molecular biology.

[3]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[4]  Daniel S. Hirschberg,et al.  A linear space algorithm for computing maximal common subsequences , 1975, Commun. ACM.

[5]  Hans Söderlund,et al.  Algorithms for the search of amino acid patterns in nucleic acid sequences , 1986, Nucleic Acids Res..