A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates.

Using simulated data, we compared five methods of phylogenetic tree estimation: parsimony, compatibility, maximum likelihood, Fitch-Margoliash, and neighbor joining. For each combination of substitution rates and sequence length, 100 data sets were generated for each of 50 trees, for a total of 5,000 replications per condition. Accuracy was assessed with two measures of the distance between the true tree and the estimated tree, one sensitive to the accuracy of branch lengths and the other not. The distance-matrix methods (Fitch-Margoliash and neighbor joining) performed best when they were constrained not to estimate negative branch lengths; all comparisons with other methods used this constraint. Parsimony and compatibility gave similar results, with compatibility generally inferior; Fitch-Margoliash and neighbor joining gave similar results, with neighbor joining generally slightly inferior. Maximum likelihood was the most successful method overall, although for short sequences Fitch-Margoliash and neighbor joining were sometimes better. Bias of the estimates was inferred by measuring whether the independent estimates of a tree from different data sets were closer to the true tree than to each other. Parsimony and compatibility had particular difficulty with inaccuracy and bias when substitution rates varied among branches. When rates of evolution varied among sites, all methods showed signs of inaccuracy and bias.
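
The abstract does not name the two distance measures, so the following is only an illustrative sketch under stated assumptions. It assumes the topology-only measure counts the bipartitions (splits) found in one tree but not the other, and that the branch-length-sensitive measure sums squared differences in branch length over all splits, treating a missing split as length zero. The tree representation (a dict from split to branch length) and all function names are hypothetical and are not taken from the paper. The last function sketches the bias check described above: it compares the mean distance from the estimates to the true tree with the mean distance among the estimates themselves.

```python
from itertools import combinations


def canonical_split(side, taxa):
    """Return the side of a bipartition that contains the smallest taxon name."""
    side = frozenset(side)
    other = frozenset(taxa) - side
    return side if min(taxa) in side else other


def make_tree(taxa, branches):
    """Build a tree as {split: branch_length} from (side, length) pairs.

    Assumed representation for this sketch only.
    """
    return {canonical_split(side, taxa): length for side, length in branches}


def topology_distance(t1, t2):
    """Count splits present in exactly one of the two trees (ignores lengths)."""
    return len(set(t1) ^ set(t2))


def branch_length_distance(t1, t2):
    """Sum squared branch-length differences; a missing split counts as 0.0."""
    return sum((t1.get(s, 0.0) - t2.get(s, 0.0)) ** 2 for s in set(t1) | set(t2))


def bias_indicator(true_tree, estimates, dist):
    """Return (mean distance to the true tree, mean distance among estimates)."""
    if len(estimates) < 2:
        raise ValueError("need at least two estimates to compare them pairwise")
    to_truth = sum(dist(true_tree, e) for e in estimates) / len(estimates)
    pairs = list(combinations(estimates, 2))
    among = sum(dist(a, b) for a, b in pairs) / len(pairs)
    return to_truth, among


# Toy example with five taxa; only internal branches are listed for brevity.
taxa = "ABCDE"
true_tree = make_tree(taxa, [("AB", 0.1), ("DE", 0.2)])  # ((A,B),C,(D,E))
est1 = make_tree(taxa, [("AB", 0.1), ("CD", 0.1)])       # ((A,B),(C,D),E)
est2 = make_tree(taxa, [("AB", 0.2), ("CD", 0.1)])
est3 = make_tree(taxa, [("AB", 0.1), ("CD", 0.3)])
estimates = [est1, est2, est3]

to_truth, among = bias_indicator(true_tree, estimates, topology_distance)
print(topology_distance(true_tree, est1), to_truth, among)
```

In this toy case the three estimates agree with one another in topology yet all differ from the true tree, so the mean distance to the truth exceeds the mean distance among estimates. Under the criterion in the abstract, that pattern is the kind of result that would be read as a sign of bias rather than random error.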
