Optimization Alignment: Down, Up, Error, and Improvements

Optimization Alignment (OA) is a method for taking unaligned sequences and creating parsimonious cladograms without the use of multiple alignment. The method consists of two parts: first, a “down-pass” that moves “down” the tree from the terminal taxa (tips) to the root or base of the cladogram and, second, an “up-pass” that moves back up from the base to the tips. The down-pass creates preliminary (i.e., provisional) hypothetical ancestral sequences at the cladogram nodes and generates the cladogram length as a weighted sum of the character transformations (nucleotide substitutions and insertion-deletion events) required by the observed (terminal) sequences. The up-pass takes the information from the down-pass and creates the “final” estimates of the hypothetical ancestral sequences. From these, the most parsimonious synapomorphy scheme can be derived to show which character transformation events characterize the various lineages on the tree. The combination of these two procedures allows phylogenetic searches to take place on unaligned sequence data, resulting in improvements in execution time and quality of results. This process differs from multiple alignment procedures (such as that of Sankoff and Cedergren [1]) in that OA attempts to determine the most parsimonious cost of a [...] homology schemes would seem to be strengths. The heuristic nature of the cladogram length and ancestral sequence reconstruction would seem to be weaknesses. These can be improved, however, as described above. Although the problem is unlikely to be solved exactly, improvements along the lines suggested here could well bring incremental benefits and, combined with ideas of others, generate more satisfactory methods and more reliable results.
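The down-pass described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names (`to_states`, `combine`, `down_pass`), the nested-tuple tree encoding, and the costs (`sub=1`, `indel=2`) are all assumptions chosen for illustration. At each internal node the two descendant sequences are compared by pairwise dynamic programming over sets of states, and a Fitch-like rule builds the preliminary ancestor: the intersection where the state sets overlap, the union otherwise, and a "base or gap" ambiguity where an indel is implied. The up-pass is not covered here.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

GAP = "-"
StateSeq = List[FrozenSet[str]]
Tree = Union[str, Tuple["Tree", "Tree"]]  # a tip name or a pair of subtrees


def to_states(seq: str) -> StateSeq:
    """Turn an observed sequence into one singleton state set per position."""
    return [frozenset(ch) for ch in seq]


def combine(a: StateSeq, b: StateSeq, sub: int = 1,
            indel: int = 2) -> Tuple[StateSeq, int]:
    """Return (preliminary ancestor, cost) for two descendant sequences."""
    n, m = len(a), len(b)

    def step(x: FrozenSet[str], y: FrozenSet[str]) -> int:
        return 0 if x & y else sub      # overlapping state sets cost nothing

    def skip(x: FrozenSet[str]) -> int:
        return 0 if GAP in x else indel  # a position that may be a gap is free

    # Fill the dynamic-programming cost matrix.
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + skip(a[i - 1])
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + skip(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j - 1] + step(a[i - 1], b[j - 1]),
                             cost[i - 1][j] + skip(a[i - 1]),
                             cost[i][j - 1] + skip(b[j - 1]))

    # Trace back one optimal path, building the ancestor right to left.
    anc: StateSeq = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + step(a[i - 1], b[j - 1])):
            x, y = a[i - 1], b[j - 1]
            anc.append(x & y if x & y else x | y)  # intersection, else union
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + skip(a[i - 1]):
            anc.append(a[i - 1] | {GAP})           # base-or-gap ambiguity
            i -= 1
        else:
            anc.append(b[j - 1] | {GAP})
            j -= 1
    anc.reverse()
    # Positions that can only be a gap carry no sequence and are dropped.
    anc = [s for s in anc if s != frozenset({GAP})]
    return anc, cost[n][m]


def down_pass(tree: Tree, seqs: Dict[str, str], sub: int = 1,
              indel: int = 2) -> Tuple[StateSeq, int]:
    """Post-order traversal: preliminary root sequence and cladogram length."""
    if isinstance(tree, str):
        return to_states(seqs[tree]), 0
    left, left_cost = down_pass(tree[0], seqs, sub, indel)
    right, right_cost = down_pass(tree[1], seqs, sub, indel)
    anc, node_cost = combine(left, right, sub, indel)
    return anc, left_cost + right_cost + node_cost


if __name__ == "__main__":
    seqs = {"A": "ACGT", "B": "ACT", "C": "AGT"}
    root, length = down_pass((("A", "B"), "C"), seqs)
    print(length, ["".join(sorted(s)) for s in root])
```

The cladogram length is simply the sum of the pairwise costs accumulated at each internal node, which is why the result is heuristic: each node is optimized against its two descendants only, using one of possibly many equally cheap tracebacks.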

[1]  S. B. Needleman, et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins, 1970, Journal of molecular biology.

[2]  Fixed Character States and the Optimization of Molecular Sequence Data, 1999.

[3]  Pablo A. Goloboff, et al. Tree Searches Under Sankoff Parsimony, 1998, Cladistics: the international journal of the Willi Hennig Society.

[4]  Pablo A. Goloboff, et al. CHARACTER OPTIMIZATION AND CALCULATION OF TREE LENGTHS, 1993.

[5]  Gonzalo Giribet. Exploring the behavior of POY, a program for direct optimization of molecular data, 2001.

[6]  J. Farris. Methods for Computing Wagner Trees, 1970.

[7]  G. Nelson, et al. 3 – HOMOLOGY AND SYSTEMATICS, 1994.

[8]  D. Sankoff, et al. Locating the vertices of a Steiner tree in arbitrary space, 1975.

[9]  W. Fitch. Toward Defining the Course of Evolution: Minimum Change for a Specific Tree Topology, 1971.

[10]  W. Wheeler. OPTIMIZATION ALIGNMENT: THE END OF MULTIPLE SEQUENCE ALIGNMENT IN PHYLOGENETICS?, 1996.

[11]  David Sankoff, et al. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, 1983.

[12]  David Sankoff, et al. Locating the vertices of a Steiner tree in an arbitrary metric space, 1975, Math. Program.

[13]  A. Phillips, et al. Multiple sequence alignment in phylogenetic analysis, 2000, Molecular phylogenetics and evolution.

[14]  W. Wheeler, et al. MALIGN: A Multiple Sequence Alignment Program, 1994.