Bayesian Evaluation of Temporal Signal in Measurably Evolving Populations

Phylogenetic methods can use the sampling times of molecular sequence data to calibrate the molecular clock, enabling the estimation of evolutionary rates and timescales for rapidly evolving pathogens and for data sets containing ancient DNA samples. A key requirement of such calibrations is that a sufficient amount of molecular evolution has occurred over the sampling time window, that is, that the data can be treated as having come from a measurably evolving population. Here we investigate the performance of a fully Bayesian evaluation of temporal signal (BETS) in sequence data. The method involves comparing the fit of two models to the data: a model in which the data are accompanied by the actual (heterochronous) sampling times, and a model in which the samples are constrained to be contemporaneous (isochronous). We conducted simulations under a wide range of conditions to demonstrate that BETS accurately classifies data sets according to whether they contain temporal signal, even when there is substantial among-lineage rate variation. We explore the behaviour of this classification in analyses of five empirical data sets: modern samples of A/H1N1 influenza virus, the bacterium Bordetella pertussis, coronaviruses from mammalian hosts, ancient DNA from Hepatitis B virus, and mitochondrial genomes of dog species. Our results indicate that BETS is an effective alternative to other tests of temporal signal. In particular, this method has the key advantage of allowing a coherent assessment of the entire model, including the molecular clock and tree prior, which are essential aspects of Bayesian phylodynamic analyses.
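
The model comparison described above can be illustrated with a minimal sketch. It assumes that model fit is summarised by a log marginal likelihood for each model and that the two are compared through a log Bayes factor; the function names, the decision threshold of zero, and the example values are hypothetical and are not taken from the study.

```python
def log_bayes_factor(log_ml_heterochronous: float, log_ml_isochronous: float) -> float:
    """Log Bayes factor in favour of the model that uses the real sampling times.

    Both arguments are log marginal likelihoods, assumed to have been estimated
    elsewhere (for example by a phylogenetic inference package).
    """
    return log_ml_heterochronous - log_ml_isochronous


def has_temporal_signal(log_ml_heterochronous: float,
                        log_ml_isochronous: float,
                        threshold: float = 0.0) -> bool:
    """Classify a data set as having temporal signal.

    The threshold is an illustrative assumption: a positive log Bayes factor
    means the heterochronous model fits better than the isochronous one.
    """
    return log_bayes_factor(log_ml_heterochronous, log_ml_isochronous) > threshold


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods, for illustration only.
    het, iso = -1000.0, -1010.0
    print("log Bayes factor:", log_bayes_factor(het, iso))
    print("temporal signal:", has_temporal_signal(het, iso))
```

In practice a stricter threshold than zero may be preferred, so that weak support for the heterochronous model is not read as evidence of temporal signal.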
