Comparing bootstrap and posterior probability values in the four-taxon case.

Assessing the reliability of a given phylogenetic hypothesis is an important step in phylogenetic analysis. Historically, the nonparametric bootstrap has been the most frequently used method for assessing support for specific phylogenetic relationships. The recent adoption of Bayesian methods for phylogenetic inference has led to clade support also being expressed as posterior probabilities. We used simulated data and the four-taxon case to explore the relationship between nonparametric bootstrap values (as inferred by maximum likelihood) and posterior probabilities (as inferred by Bayesian analysis). The results suggest a complex association between the two measures. Three general regions of tree space can be identified: (1) the neutral zone, where differences between mean bootstrap and mean posterior probability values are not significant; (2) the region near the two-branch corner; and (3) the region deep in the two-branch corner. In the last two regions, mean bootstrap and mean posterior probability values differ significantly. Whether bootstrap or posterior probability values are higher depends on the data in support of alternative topologies. Examination of star topologies revealed that both bootstrap and posterior probability values differ significantly from theoretical expectations; in particular, there are more posterior probability values in the range 0.85-1 than expected by theory. Our results therefore corroborate the findings of others that posterior probability values are excessively high. They also suggest that extrapolations from studies of a single topology and set of branch lengths are unlikely to provide general conclusions about the relationship between bootstrap and posterior probability values.
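For readers unfamiliar with the procedure, the nonparametric bootstrap in the four-taxon case can be sketched as follows. This is only an illustration under stated assumptions, not the analysis used in the study: the study inferred trees by maximum likelihood, whereas this sketch swaps in a simple parsimony count of informative site patterns so that it stays self-contained. The function names, the topology labels, and the rule that tied replicates support no topology are all hypothetical choices made for the example.

```python
import random
from collections import Counter

# The three possible unrooted topologies for taxa A, B, C, D.
TOPOLOGIES = ("AB|CD", "AC|BD", "AD|BC")

def informative_pattern(column):
    """Return the topology a site supports under parsimony, or None."""
    a, b, c, d = column
    if a == b and c == d and a != c:
        return "AB|CD"
    if a == c and b == d and a != b:
        return "AC|BD"
    if a == d and b == c and a != b:
        return "AD|BC"
    return None  # constant or otherwise uninformative site

def best_topology(columns):
    """Pick the topology backed by the most informative sites.

    Assumption for this sketch: a tie (or no informative sites)
    is treated as unresolved and returns None.
    """
    counts = Counter(p for p in map(informative_pattern, columns) if p)
    if not counts:
        return None
    top = max(counts.values())
    winners = [t for t, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else None

def bootstrap_support(alignment, replicates=1000, seed=0):
    """Resample alignment columns with replacement and report, for each
    topology, the fraction of replicates in which it is the best tree."""
    rng = random.Random(seed)
    columns = list(zip(*alignment))  # one tuple of four states per site
    wins = Counter()
    for _ in range(replicates):
        sample = rng.choices(columns, k=len(columns))
        wins[best_topology(sample)] += 1
    return {t: wins[t] / replicates for t in TOPOLOGIES}

if __name__ == "__main__":
    # Toy alignment (rows are taxa A, B, C, D); not data from the study.
    toy = ["AAGTCA", "AAGTTA", "CCGACA", "CCGATA"]
    print(bootstrap_support(toy, replicates=200, seed=1))
```

The returned fractions play the role of bootstrap values for the three competing topologies; they need not sum to 1 here because unresolved replicates are not assigned to any topology.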
