Outgroup misplacement and phylogenetic inaccuracy under a molecular clock--a simulation study.

We conducted a simulation study of the phylogenetic methods UPGMA, neighbor joining, maximum parsimony, and maximum likelihood for a five-taxon tree under a molecular clock. The parameter space included a small region where maximum parsimony is inconsistent, so we tested inconsistency correction for parsimony and distance correction for neighbor joining. As expected, corrected parsimony was consistent. For these data, maximum likelihood with the clock assumption outperformed each of the other methods tested. The distance-based methods performed marginally better than maximum parsimony and maximum likelihood without the clock assumption. Data correction was generally detrimental to accuracy, especially for short sequences. We identified another region of the parameter space where, although a given method is consistent, some incorrect trees were each selected with up to twice the frequency of the correct (generating) tree for sequences of bounded length. In these incorrect trees the outgroup is misplaced. Beyond this problem, the placement of the outgroup sequence can confound the ingroup tree: the ingroup topology is recovered correctly from the ingroup sequences alone but becomes incorrect once the outgroup is included.
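To make the distance-based step concrete, the following is a minimal Python sketch of a distance correction followed by UPGMA clustering on five taxa. It is an illustration only, not the software or settings used in the study: the Jukes-Cantor correction is assumed here as one common choice, and the function names (`jc_correct`, `upgma`, `canon`), the taxon labels, and the pairwise values are all hypothetical.

```python
import math
from itertools import combinations

def jc_correct(p):
    """Jukes-Cantor corrected distance from an observed proportion p of differing sites.
    (Assumed correction for illustration; the study's own correction may differ.)"""
    if p >= 0.75:
        return float("inf")  # saturation: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def upgma(names, dist):
    """Cluster taxa by UPGMA.
    dist maps frozenset({a, b}) -> distance; returns the topology as nested tuples."""
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

def canon(tree):
    """Order-independent form of a nested-tuple topology, for comparing trees."""
    return tree if isinstance(tree, str) else frozenset(canon(t) for t in tree)

# Hypothetical observed proportions of differing sites; "O" plays the outgroup.
observed = {
    ("A", "B"): 0.10, ("C", "D"): 0.12,
    ("A", "C"): 0.20, ("A", "D"): 0.20, ("B", "C"): 0.20, ("B", "D"): 0.20,
    ("A", "O"): 0.40, ("B", "O"): 0.40, ("C", "O"): 0.40, ("D", "O"): 0.40,
}
corrected = {frozenset(pair): jc_correct(p) for pair, p in observed.items()}
tree = upgma(["A", "B", "C", "D", "O"], corrected)
generating = ((("A", "B"), ("C", "D")), "O")
print(canon(tree) == canon(generating))
```

In a simulation of the kind described, a comparison such as `canon(tree) == canon(generating)` would be repeated over many simulated data sets to estimate how often a method recovers the generating tree, and which incorrect trees (for example, those with a misplaced outgroup) are chosen instead.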
