The influence of rate heterogeneity among sites on the time dependence of molecular rates.

Molecular evolutionary rate estimates have been shown to depend on the time period over which they are estimated. Factors such as demographic processes, calibration errors, purifying selection, and the heterogeneity of substitution rates among sites (RHAS) are known to affect the accuracy with which rates of evolution are estimated. We use mathematical modeling and Bayesian analyses of simulated sequence alignments to explore how mutational hotspots can lead to time-dependent rate estimates. Mathematical modeling shows that underestimation of molecular rates over increasing time scales is inevitable when RHAS is ignored. Although a gamma distribution is commonly used to model RHAS, we show that when the actual RHAS deviates from a gamma-like distribution, rates can be either under- or overestimated in a time-dependent manner. Simulations performed under different scenarios of RHAS confirm the mathematical modeling and demonstrate the impacts of time-dependent rates on estimates of divergence times. Most notably, erroneous rate estimates can have narrow credibility intervals, leading to false confidence in biased estimates of rates and node ages. Surprisingly, large errors in estimates of overall molecular rate do not necessarily generate large errors in divergence time estimates. Finally, we illustrate the correlation between time-dependent rate patterns and differential saturation between quickly and slowly evolving sites. Our results suggest that data partitioning or simple nonparametric mixture models of RHAS significantly improve the accuracy with which node ages and substitution rates can be estimated.
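The claim that ignoring RHAS leads to increasing underestimation of rates over longer time scales can be illustrated with a minimal sketch. This is not the model used in the paper; it assumes a Jukes-Cantor (JC69) substitution process, two lineages that diverged t years ago, and site rates drawn from a mean-one gamma distribution with shape `alpha`. The function name `apparent_rate` and the values of `mu`, `alpha`, and `times` are hypothetical choices made only for illustration.

```python
import math

def apparent_rate(mu, t, alpha):
    """Rate recovered by a JC69 distance correction that ignores RHAS,
    when the true site rates are gamma-distributed (mean 1, shape alpha).

    mu    : true mean substitution rate per site per year (assumed value)
    t     : time since divergence in years (each lineage evolves for t)
    alpha : gamma shape parameter (smaller = stronger heterogeneity)
    """
    # JC69 exponent for a pair of sequences separated by total time 2t
    c = (8.0 / 3.0) * mu * t
    # Expected proportion of differing sites, averaged over gamma rates:
    # E[exp(-c * r)] = (1 + c / alpha) ** (-alpha) for r ~ Gamma(alpha, 1/alpha)
    p = 0.75 * (1.0 - (1.0 + c / alpha) ** (-alpha))
    # Standard JC69 correction, which assumes every site has the same rate
    d_hat = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
    # Convert the pairwise distance into a per-lineage rate
    return d_hat / (2.0 * t)

mu = 1e-8      # hypothetical true rate
alpha = 0.5    # hypothetical shape, i.e. strong rate heterogeneity
times = [1e4, 1e5, 1e6, 1e7, 1e8]
rates = [apparent_rate(mu, t, alpha) for t in times]

for t, r in zip(times, rates):
    print(f"t = {t:.0e} yr  apparent/true rate = {r / mu:.3f}")
```

Under these assumptions the apparent rate is close to the true rate for recent divergences and falls steadily as t grows, because fast sites saturate while the uncorrected model treats all sites as evolving at the mean rate.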
