Continuous and tractable models for the variation of evolutionary rates.

We propose a continuous model for variation in the evolutionary rate across sites and over the phylogenetic tree, and we derive exact transition probabilities for substitutions under this model. Changes in rate are modelled using the Cox-Ingersoll-Ross (CIR) process, a diffusion widely used in financial applications. The model directly extends the standard gamma-distributed rates-across-sites model, adding one parameter that governs changes in rate down the tree. The parameters of the model can be estimated directly from two well-known statistics: the index of dispersion and the gamma shape parameter of the rates-across-sites model. The CIR model can be readily incorporated into probabilistic models of sequence evolution. We provide an exact formula for the likelihood of a three-taxon tree. The likelihoods of larger trees can be evaluated using Monte Carlo methods.
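To make the rate process concrete, the following is a minimal sketch of how a CIR rate trajectory along a single lineage might be simulated. It assumes the usual financial parameterisation dR = reversion * (mean_rate - R) dt + volatility * sqrt(R) dW, which may differ from the parameterisation used in the paper. The function names, argument names and parameter values are hypothetical. Under this parameterisation the stationary distribution is gamma with shape 2 * reversion * mean_rate / volatility^2, which is the sense in which the process can sit on top of a gamma rates-across-sites model.

```python
import math
import random


def _poisson(mean, rng):
    """Knuth's multiplication sampler; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def stationary_gamma_shape(mean_rate, reversion, volatility):
    """Shape of the stationary gamma law (assumed parameterisation, see lead-in)."""
    return 2.0 * reversion * mean_rate / volatility ** 2


def simulate_cir(r0, mean_rate, reversion, volatility, dt, n_steps, rng):
    """Sample a CIR path on a regular time grid from its exact transition law.

    The transition is a scaled noncentral chi-square, drawn here as a gamma
    variate whose shape is shifted by a Poisson count, so there is no
    time-discretisation error and the rate never becomes negative.
    """
    shape = stationary_gamma_shape(mean_rate, reversion, volatility)
    decay = math.exp(-reversion * dt)
    # Gamma scale of the transition law over one step of length dt.
    scale = volatility ** 2 * (1.0 - decay) / (2.0 * reversion)
    path = [r0]
    for _ in range(n_steps):
        count = _poisson(path[-1] * decay / scale, rng)
        path.append(rng.gammavariate(shape + count, scale))
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    rates = simulate_cir(1.0, 1.0, 1.0, 1.0, 0.5, 20000, rng)
    print("long-run mean rate:", sum(rates) / len(rates))
```

With a long path, the sample mean should settle near `mean_rate` and the sample variance near `mean_rate * volatility**2 / (2 * reversion)`, the moments of the stationary gamma distribution. This only illustrates the rate process itself; it does not reproduce the paper's transition probabilities or likelihood formula.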
