Comparison of likelihood and Bayesian methods for estimating divergence times using multiple gene loci and calibration points, with application to a radiation of cute-looking mouse lemur species.

Divergence time and substitution rate are seriously confounded in phylogenetic analysis, making it difficult to estimate divergence times when the molecular clock (rate constancy among lineages) is violated. This problem can be alleviated to some extent by analyzing multiple gene loci simultaneously and by using multiple calibration points. While different genes may have different patterns of evolutionary rate change, they share the same divergence times. Indeed, the fact that each gene may violate the molecular clock differently leads to the advantage of simultaneous analysis of multiple loci. Multiple calibration points provide the means for characterizing the local evolutionary rates on the phylogeny. In this paper, we extend previous likelihood models of local molecular clock for estimating species divergence times to accommodate multiple calibration points and multiple genes. Heterogeneity among different genes in evolutionary rate and in substitution process is accounted for by the models. We apply the likelihood models to analyze two mitochondrial protein-coding genes, cytochrome oxidase II and cytochrome b, to estimate divergence times of Malagasy mouse lemurs and related outgroups. The likelihood method is compared with the Bayes method of Thorne et al. (1998, Mol. Biol. Evol. 15:1647-1657), which uses a probabilistic model to describe the change in evolutionary rate over time and uses the Markov chain Monte Carlo procedure to derive the posterior distribution of rates and times. Our likelihood implementation has the drawbacks of failing to accommodate uncertainties in fossil calibrations and of requiring the researcher to classify branches on the tree into different rate groups. Both problems are avoided in the Bayes method. Despite the differences in the two methods, however, data partitions and model assumptions had the greatest impact on date estimation. 
The three codon positions have very different substitution rates and evolutionary dynamics, and assumptions in the substitution model affect date estimation in both likelihood and Bayes analyses. The results demonstrate that the separate analysis is unreliable, with dates variable among codon positions and between methods, and that the combined analysis is much more reliable. When the three codon positions were analyzed simultaneously under the most realistic models using all available calibration information, the two methods produced similar results. The divergence of the mouse lemurs is dated to be around 7-10 million years ago, indicating a surprisingly early species radiation for such a morphologically uniform group of primates.
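The confounding of rate and time described above, and the way calibrations and multiple loci help, can be illustrated with a toy calculation. The sketch below is not the likelihood or Bayes method compared in this paper: it assumes a strict clock within each locus, treats node depths (expected substitutions per site from a node to the tips) as known without error, and combines loci by simple least squares. All function names and numbers are hypothetical and chosen only for illustration.

```python
import math


def rate_from_calibration(depth_cal, age_cal):
    """Per-locus rate implied by a calibration node of known age.

    depth_cal: node depth in substitutions per site (assumed known).
    age_cal:   fossil age of that node, in million years (assumed exact).
    """
    return depth_cal / age_cal


def shared_time(depths_target, rates):
    """Least-squares age t of a target node shared by all loci.

    Minimizes sum_g (d_g - r_g * t)^2, which gives
    t = sum_g(r_g * d_g) / sum_g(r_g^2).
    """
    num = sum(r * d for r, d in zip(rates, depths_target))
    den = sum(r * r for r in rates)
    return num / den


# Confounding: the data inform only the product rate * time, so a slow
# rate over a long time and a fast rate over a short time look identical.
depth_slow_long = 0.0025 * 18.0
depth_fast_short = 0.005 * 9.0
same_depth = math.isclose(depth_slow_long, depth_fast_short)

# Hypothetical two-locus example: the loci evolve at different rates
# but share the same calibration age and the same target node age.
age_cal = 60.0                      # hypothetical calibration age (My)
depths_cal = [0.30, 0.60]           # hypothetical calibration-node depths
depths_target = [0.045, 0.090]      # hypothetical target-node depths

rates = [rate_from_calibration(d, age_cal) for d in depths_cal]
t_hat = shared_time(depths_target, rates)
```

Without the calibration, only the products `rate * time` would be identifiable; fixing one node age pins down each locus's rate, and the shared node age is then estimated jointly from both loci even though their rates differ.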

[1]  P. Liò,et al.  Molecular phylogenetics: state-of-the-art methods for looking into the past. , 2001, Trends in genetics : TIG.

[2]  Z. Yang,et al.  Estimation of primate speciation dates using local molecular clocks. , 2000, Molecular biology and evolution.

[3]  H Kishino,et al.  Converting distance to time: application to human evolution. , 1990, Methods in enzymology.

[4]  P. Gingerich,et al.  Time of origin of primates , 1994 .

[5]  S. Tavaré,et al.  Using the fossil record to estimate the age of the last common ancestor of extant primates , 2002, Nature.

[6]  Andrew Rambaut,et al.  Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies , 2000, Bioinform..

[7]  C. Groves,et al.  Primate phylogeny: morphological vs. molecular results. , 1996, Molecular phylogenetics and evolution.

[8]  Jörg U. Ganzhorn,et al.  Taxonomic Revision of Mouse Lemurs (Microcebus) in the Western Portions of Madagascar , 2000, International Journal of Primatology.

[9]  M. Kendall,et al.  Kendall's advanced theory of statistics , 1995 .

[10]  D. Swofford,et al.  Should we use model-based methods for phylogenetic inference when we know that assumptions about among-site rate variation and nucleotide substitution pattern are violated? , 2001, Systematic biology.

[11]  W. Bruno,et al.  Performance of a divergence time estimation method under a probabilistic model of rate evolution. , 2001, Molecular biology and evolution.

[12]  A. Yoder,et al.  Remarkable species diversity in Malagasy mouse lemurs (primates, Microcebus). , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[13]  S. Jeffery Evolution of Protein Molecules , 1979 .

[14]  Adoum H. Mahamat,et al.  A new hominid from the Upper Miocene of Chad, Central Africa , 2002, Nature.

[15]  J. Thewissen Phylogenetic aspects of Cetacean origins: A morphological perspective , 1994, Journal of Mammalian Evolution.

[16]  R. Martin Primate origins: plugging the gaps , 1993, Nature.

[17]  J. Huelsenbeck,et al.  A compound poisson process for relaxing the molecular clock. , 2000, Genetics.

[18]  Z. Yang,et al.  Among-site rate variation and its impact on phylogenetic analyses. , 1996, Trends in ecology & evolution.

[19]  J. Felsenstein Evolutionary trees from DNA sequences: A maximum likelihood approach , 2005, Journal of Molecular Evolution.

[20]  J. Thorne,et al.  Divergence Time and Evolutionary Rate Estimation with Multilocus Data , 2002 .

[21]  H. Kishino,et al.  Estimating the rate of evolution of the rate of molecular evolution. , 1998, Molecular biology and evolution.

[22]  B. MacFadden,et al.  Perissodactyla and Proboscidea , 1998 .

[23]  D. Prothero,et al.  The Evolution of perissodactyls , 1989 .

[24]  Ziheng Yang,et al.  PAML: a program package for phylogenetic analysis by maximum likelihood , 1997, Comput. Appl. Biosci..

[25]  L. Pauling,et al.  Evolutionary Divergence and Convergence in Proteins , 1965 .

[26]  T. Jukes Evolution of Protein Molecules , 1969 .

[27]  Ziheng Yang Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods , 1994, Journal of Molecular Evolution.

[28]  N. Goldman,et al.  A codon-based model of nucleotide substitution for protein-coding DNA sequences. , 1994, Molecular biology and evolution.

[29]  M. Sanderson Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. , 2002, Molecular biology and evolution.

[30]  S. O’Brien,et al.  Placental mammal diversification and the Cretaceous–Tertiary boundary , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[31]  Stéphane Aris-Brosou,et al.  Effects of models of rate evolution on estimation of divergence dates with special reference to the metazoan 18S ribosomal RNA phylogeny. , 2002, Systematic biology.

[32]  Michael J. Sanderson,et al.  A Nonparametric Approach to Estimating Divergence Times in the Absence of Rate Constancy , 1997 .

[33]  A. Rodrigo,et al.  The inference of stepwise changes in substitution rates using serial sequence samples. , 2001, Molecular biology and evolution.

[34]  N. Goldman,et al.  Comparison of models for nucleotide substitution used in maximum-likelihood phylogenetic estimation. , 1994, Molecular biology and evolution.

[35]  S. Muse,et al.  A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. , 1994, Molecular biology and evolution.

[36]  S. Kumar,et al.  Patterns of nucleotide substitution in mitochondrial protein coding genes of vertebrates. , 1996, Genetics.

[37]  A Rzhetsky,et al.  Phylogenetic test of the molecular clock and linearized trees. , 1995, Molecular biology and evolution.

[38]  Kathleen M. Scott,et al.  Evolution of Tertiary Mammals of North America , 1998 .

[39]  A. Rambaut,et al.  Estimating divergence dates from molecular sequences. , 1998, Molecular biology and evolution.