Inferring speciation and extinction rates under different sampling schemes.

The birth-death process is widely used in phylogenetics to model speciation and extinction. Recent studies have shown that the inferred rates are sensitive to assumptions about the sampling probability of lineages. Here, we examine the effect of the method used to sample lineages. Whereas previous studies have assumed random sampling (RS), we consider two extreme cases of biased sampling: "diversified sampling" (DS), where tips are selected to maximize diversity, and "cluster sampling" (CS), where sample diversity is minimized. DS appears to be standard practice, for example, in analyses of higher taxa, whereas CS may occur under special circumstances, for example, in studies of geographically defined floras or faunas. Using both simulations and analyses of empirical data, we show that inferred rates may be heavily biased if the sampling strategy is not modeled correctly. In particular, when a diversified sample is treated as if it were a random or complete sample, the extinction rate is severely underestimated, with estimates often close to 0. Such dramatic errors may lead to serious consequences, for example, if estimated rates are used in assessing the vulnerability of threatened species to extinction. Using Bayesian model testing across 18 empirical data sets, we show that DS is commonly a better fit to the data than complete, random, or cluster sampling. Inappropriate modeling of the sampling method may at least partly explain anomalous results that have previously been attributed to variation over time in birth and death rates.
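The contrast between random and diversified sampling can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions that the abstract does not state: the complete tree is represented only by its internal node ages, listed in left-to-right order so that `ages[i]` is the divergence between adjacent tips `i` and `i + 1`, and DS is formalized as choosing the n tips that retain the n - 1 oldest divergences. All function names are hypothetical. Cluster sampling is not sketched.

```python
import random


def sampled_node_ages(ages, tips):
    """Node ages of the tree induced by a sorted list of sampled tip indices.

    Assumption: ages[i] separates adjacent tips i and i + 1, so the most
    recent common ancestor of tips a < b is the oldest node between them.
    """
    return [max(ages[a:b]) for a, b in zip(tips, tips[1:])]


def diversified_tips(ages, n):
    """Pick n tips so that the n - 1 oldest divergences are retained (assumed DS)."""
    oldest = sorted(range(len(ages)), key=lambda i: ages[i], reverse=True)[: n - 1]
    cuts = sorted(oldest)
    # The oldest nodes split the tips into n blocks; take the first tip of each.
    return [0] + [c + 1 for c in cuts]


def random_tips(m, n, rng):
    """Pick n of the m tips uniformly at random (RS)."""
    return sorted(rng.sample(range(m), n))


# Toy complete tree with 7 tips and 6 internal node ages (made-up values).
ages = [0.5, 3.0, 1.2, 0.1, 2.4, 0.7]
ds_ages = sampled_node_ages(ages, diversified_tips(ages, 3))   # [3.0, 2.4]
rs_ages = sampled_node_ages(ages, random_tips(len(ages) + 1, 3, random.Random(1)))
```

In this toy setting a diversified sample always keeps the deepest nodes, whereas a random sample can also retain young ones. Treating the former as if it were the latter misrepresents the distribution of node ages, which is the kind of mismatch the abstract links to underestimated extinction rates.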
