TreeTime: Maximum-likelihood phylodynamic analysis

Abstract: Mutations that accumulate in the genomes of cells or viruses can be used to infer their evolutionary history. For rapidly evolving organisms, genomes can reveal their detailed spatiotemporal spread. Such phylodynamic analyses are particularly useful for understanding the epidemiology of rapidly evolving viral pathogens. As the number of genome sequences available for different pathogens has increased dramatically in recent years, phylodynamic analysis with traditional methods has become challenging because these methods scale poorly with growing datasets. Here, we present TreeTime, a Python-based framework for phylodynamic analysis using an approximate maximum-likelihood approach. TreeTime can estimate ancestral states, infer evolution models, reroot trees to maximize temporal signal, and estimate molecular clock phylogenies and population size histories. The runtime of TreeTime scales linearly with dataset size.
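
To make the notion of "temporal signal" concrete, the following is a minimal sketch of a root-to-tip regression, a standard way to assess a molecular clock: each tip's genetic distance from the root is regressed against its sampling date, the slope gives a clock rate, and the x-intercept gives an estimate of the root date. This is an illustration only, not TreeTime's implementation or API. The function name `root_to_tip_regression` and the example dates and distances are hypothetical and chosen for the demonstration.

```python
# Illustrative root-to-tip regression in pure Python.
# NOT TreeTime's code or API; names and data here are hypothetical.
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of distance = rate * (date - root_date).

    dates      -- sampling dates of the tips (e.g. decimal years)
    distances  -- root-to-tip distances (substitutions per site)
    Returns (rate, root_date, r_squared).
    """
    if len(dates) != len(distances) or len(dates) < 2:
        raise ValueError("need at least two (date, distance) pairs")

    mean_t, mean_d = mean(dates), mean(distances)
    var_t = sum((t - mean_t) ** 2 for t in dates)
    if var_t == 0:
        raise ValueError("sampling dates must not all be identical")

    # Slope of the regression line: the estimated clock rate.
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = cov / var_t
    intercept = mean_d - rate * mean_t

    # The line crosses zero distance at the estimated root date.
    root_date = -intercept / rate if rate != 0 else float("nan")

    # Fraction of variance explained: a rough measure of temporal signal.
    ss_tot = sum((d - mean_d) ** 2 for d in distances)
    ss_res = sum((d - (intercept + rate * t)) ** 2
                 for t, d in zip(dates, distances))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return rate, root_date, r_squared


# Hypothetical, noise-free data: rate 0.002 per site per year, root in 2008.
dates = [2010.0, 2011.0, 2012.0, 2013.0]
distances = [0.004, 0.006, 0.008, 0.010]
rate, root_date, r_squared = root_to_tip_regression(dates, distances)
print(rate, root_date, r_squared)
```

Rerooting to maximize temporal signal can then be thought of as choosing the root position for which such a regression fits best. How candidate roots are searched is not specified in the abstract and is left out of this sketch.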
