Towards Integration of Multiple Alignment and Phylogenetic Tree Construction

A central problem in the study of molecular evolution is reconstructing the history of a set of biological sequences in the form of a phylogenetic tree. One step in calculating this tree is the computation of a multiple alignment. Most existing approaches treat multiple alignment and tree construction as separate problems, although in fact they influence each other. Based on three-way alignments of pre-aligned groups of sequences, we adapt a commonly used tree construction procedure to produce both the tree and the multiple alignment simultaneously. In contrast to existing iterative algorithms, the new method can revise, at a later stage, alignments made early in the computation. A sufficient criterion that prevents the introduction of edges with negative length also reduces the number of three-way alignments that need to be computed. Applications of the new approach to the alignment of protein and nucleic acid sequences are presented.
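
To illustrate why edges with negative length can arise when a tree is fitted to distances, the sketch below computes the three edge lengths of a star tree joining three leaves from their pairwise distances. This is the standard three-point formula, not the sufficient criterion of the paper, which the abstract does not spell out. The function names and the example distances are hypothetical and chosen only for illustration.

```python
def star_edge_lengths(d_ab, d_ac, d_bc):
    """Edge lengths of the star tree on leaves a, b, c.

    The unique star tree whose path lengths reproduce the three
    pairwise distances has the edge lengths returned here
    (standard three-point formula; illustrative only).
    """
    edge_a = (d_ab + d_ac - d_bc) / 2
    edge_b = (d_ab + d_bc - d_ac) / 2
    edge_c = (d_ac + d_bc - d_ab) / 2
    return edge_a, edge_b, edge_c


def has_negative_edge(d_ab, d_ac, d_bc):
    """True if fitting the three distances yields a negative edge.

    This happens exactly when the distances violate the
    triangle inequality.
    """
    return min(star_edge_lengths(d_ab, d_ac, d_bc)) < 0


if __name__ == "__main__":
    # Hypothetical distances that satisfy the triangle inequality.
    print(star_edge_lengths(3, 4, 5))
    # Hypothetical distances that violate it: d_bc > d_ab + d_ac.
    print(has_negative_edge(1, 1, 5))
```

In a setting like the one described above, distances would come from alignments of sequence groups rather than fixed numbers, so a check of this kind could in principle be used to skip candidate groupings before any costly three-way alignment is computed; whether the paper proceeds this way is an assumption.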
