Impact of Phylogenetic Tree Completeness and Mis-specification of Sampling Fractions on Trait Dependent Diversification Models

Abstract: Understanding the origins of diversity and the factors that drive some clades to be more diverse than others are important issues in evolutionary biology. Sophisticated SSE (state-dependent speciation and extinction) models provide insights into the association between diversification rates and the evolution of a trait. The empirical data used in SSE models and other methods are normally imperfect, yet little is known about how this affects these models. Here, we evaluate the impact of common phylogenetic issues on inferences drawn from SSE models. Using simulated phylogenetic trees and trait information, we fitted SSE models to determine the effects of sampling fraction (phylogenetic tree completeness) and sampling fraction mis-specification on model selection and parameter estimation (speciation, extinction, and transition rates) under two sampling regimes (random and taxonomically biased). As expected, we found that the accuracy of both model selection and parameter estimates is reduced at lower sampling fractions (i.e., low tree completeness). Furthermore, when sampling of the tree is imbalanced across sub-clades and tree completeness is ≤ 60%, rates of false positives increase and parameter estimates are less accurate than when sampling is random. Thus, when SSE methods are applied to empirical datasets, the risk of falsely inferring trait dependent diversification increases when some sub-clades are heavily under-sampled. Mis-specifying the sampling fraction severely affected the accuracy of parameter estimates: parameter values were over-estimated when the sampling fraction was specified as lower than its true value, and under-estimated when the sampling fraction was specified as higher than its true value. Our results suggest that it is better to cautiously under-estimate sampling effort, as false positives increased when the sampling fraction was over-estimated. We encourage SSE studies where the sampling fraction can be reasonably estimated and provide recommended best practices for SSE modeling. [Trait dependent diversification; SSE models; phylogenetic tree completeness; sampling fraction.]
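The direction of the bias from a mis-specified sampling fraction can be illustrated with a much simpler model than the SSE likelihoods evaluated here. The sketch below is an assumption-laden toy, not the authors' method. It uses a pure-birth (Yule) method-of-moments estimator for a stem-aged clade, r = ln(N) / t, where the implied total richness N is the number of sampled tips divided by the specified sampling fraction. All names and numbers (`implied_richness`, `yule_stem_rate`, 300 sampled tips, a stem age of 20) are hypothetical and chosen only for illustration.

```python
import math


def implied_richness(n_sampled: int, sampling_fraction: float) -> float:
    """Total species richness implied by a specified sampling fraction."""
    if not 0.0 < sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be in (0, 1]")
    return n_sampled / sampling_fraction


def yule_stem_rate(n_sampled: int, sampling_fraction: float, stem_age: float) -> float:
    """Pure-birth, stem-age rate estimate r = ln(N) / t (toy model, no extinction)."""
    if stem_age <= 0.0:
        raise ValueError("stem_age must be positive")
    return math.log(implied_richness(n_sampled, sampling_fraction)) / stem_age


# Hypothetical clade: 500 species in total, 300 of them in the tree.
N_SAMPLED = 300
TRUE_FRACTION = 300 / 500
STEM_AGE = 20.0  # arbitrary time units

if __name__ == "__main__":
    reference = yule_stem_rate(N_SAMPLED, TRUE_FRACTION, STEM_AGE)
    for specified in (0.3, TRUE_FRACTION, 0.9):
        rate = yule_stem_rate(N_SAMPLED, specified, STEM_AGE)
        print(f"specified f = {specified:.2f}  rate = {rate:.4f}  "
              f"ratio to correctly specified = {rate / reference:.3f}")
```

In this toy setting, specifying a fraction below the true value inflates the implied richness and therefore the rate, while specifying a fraction above the true value deflates it, which is the same direction of bias reported for the SSE parameter estimates. The effect on false-positive rates in model selection is not captured by this sketch.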
