The problem of rooting rapid radiations.

Many groups (such as birds, bees, mammals, multicellular animals, and flowering plants) have undergone a rapid radiation. In such cases, which combine short internal branches with long external branches, correctly estimating and rooting phylogenetic trees is known to be difficult. In this simulation study, we tested the performance of different phylogenetic methods at estimating a tree that models a rapid radiation. We found that maximum likelihood, corrected and uncorrected neighbor-joining, and corrected and uncorrected parsimony all suffer from biases toward specific tree topologies. In addition, we found that using a single-taxon outgroup to root a tree frequently disrupts an otherwise correct ingroup phylogeny. Moreover, for uncorrected parsimony, we found cases where several individual trees (in which the outgroup was placed incorrectly) were each selected more frequently than the correct tree. Even for parameter settings where the correct tree was selected most frequently when using extremely long sequences, for sequences of up to 60,000 nucleotides the incorrectly rooted trees were each selected more frequently than the correct tree. In all the cases tested here, tree estimation using a two-taxon outgroup was more accurate than estimation using a single-taxon outgroup. However, the ingroup was most accurately recovered when no outgroup was used.
