Comparative biosequence metrics

Summary. The sequence alignment algorithms of Needleman and Wunsch (1970) and Sellers (1974) are compared. Although the former maximizes similarity and the latter minimizes differences, the two procedures are proven to be equivalent. The equivalence relations necessary for the two procedures to give the same result are: (1) the weight assigned to a gap in the Sellers algorithm exceeds that in the Needleman-Wunsch algorithm by exactly half the length of the gap times the maximum match value; and (2) for any pair of aligned elements, the degree of similarity assigned by the Needleman-Wunsch algorithm plus the degree of dissimilarity assigned by the Sellers algorithm equals a constant. The utility of the algorithms is independent of the nature of the elements in the sequence, which could be anything from geological sequences to the amino acid sequences of proteins. Examples are provided using known nucleotide sequences, one of which shows two sequences to be analogous rather than homologous.
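The two equivalence relations can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes linear gap weights (a gap of length k costs k times a per-element weight), penalized end gaps, and a toy scoring scheme in which a match scores the maximum value M and a mismatch scores 0. All names (`nw_similarity`, `sellers_distance`, `M`, `GAP_S`, `GAP_D`) are hypothetical. Under these assumptions, an alignment with p aligned pairs and g gapped elements satisfies n + m = 2p + g, so relations (1) and (2) together imply D = M(n + m)/2 - S for sequences of lengths n and m.

```python
# Toy scoring scheme (an assumption for illustration, not the paper's values).
M = 2                      # maximum match value, also the constant in relation (2)
GAP_S = 1                  # per-element gap weight for the similarity method
GAP_D = GAP_S + M // 2     # relation (1): distance gap weight is larger by M/2 per element


def sim(x, y):
    """Similarity of two aligned elements: M for a match, 0 otherwise."""
    return M if x == y else 0


def dist(x, y):
    """Relation (2): similarity plus dissimilarity equals the constant M."""
    return M - sim(x, y)


def nw_similarity(a, b):
    """Maximum global-alignment similarity (Needleman-Wunsch style)."""
    prev = [-GAP_S * j for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [-GAP_S * i]
        for j, y in enumerate(b, 1):
            cur.append(max(prev[j - 1] + sim(x, y),   # align x with y
                           prev[j] - GAP_S,           # x against a gap
                           cur[j - 1] - GAP_S))       # y against a gap
        prev = cur
    return prev[-1]


def sellers_distance(a, b):
    """Minimum global-alignment distance (Sellers style)."""
    prev = [GAP_D * j for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [GAP_D * i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j - 1] + dist(x, y),  # align x with y
                           prev[j] + GAP_D,           # x against a gap
                           cur[j - 1] + GAP_D))       # y against a gap
        prev = cur
    return prev[-1]


def equivalent(a, b):
    """Check D == M * (n + m) / 2 - S for one pair of sequences."""
    return sellers_distance(a, b) == M * (len(a) + len(b)) // 2 - nw_similarity(a, b)
```

For example, `equivalent("GATTACA", "GCATGCT")` should return `True`, and two identical sequences give a distance of 0 together with the maximum possible similarity. Choosing an even M keeps all weights integral; with an odd M the gap weights would need to be fractions.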

[1] Richard W. Hamming, et al. Error detecting and error correcting codes, 1950.

[2] M. O. Dayhoff, et al. Atlas of protein sequence and structure, 1965.

[3] S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins, 1970, Journal of Molecular Biology.

[4] P. Sellers. On the Theory and Computation of Evolutionary Distances, 1974.

[5] D. Pribnow. Nucleotide sequence of an RNA polymerase binding site at an early T7 promoter, 1975, Proceedings of the National Academy of Sciences of the United States of America.

[6] W. A. Beyer, et al. Some Biological Sequence Metrics, 1976.

[7] M. Rosenberg, et al. Determination of nucleotide sequences beyond the sites of transcriptional termination, 1976, Proceedings of the National Academy of Sciences of the United States of America.

[8] W. Fitch. Phylogenies constrained by the crossover process as illustrated by human hemoglobins and a thirteen-cycle, eleven-amino-acid repeat in human apolipoprotein A-I, 1977, Genetics.

[9] B. Bainbridge, et al. Genetics, 1981, Experientia.

[10] D. Court, et al. Regulatory sequences involved in the promotion and termination of RNA transcription, 1979, Annual Review of Genetics.

[11] D. Capon, et al. dnaG (primase)-dependent origins of DNA replication. Nucleotide sequences of the negative strand initiation sites of bacteriophages St-1, phi K, and alpha 3, 1979, The Journal of Biological Chemistry.

[12] Temple F. Smith, et al. New Stratigraphic Correlation Techniques, 1980, The Journal of Geology.

[13] R. E. Dickerson. Cytochrome c and the evolution of energy metabolism, 1980, Scientific American.

[14] R. Dickerson. Evolution and gene transfer in purple photosynthetic bacteria, 1980, Nature.

[15] Phylogenies from amino acid sequences aligned with gaps: The problem of gap weighting, 1975, Journal of Molecular Evolution.