Gene tree discordance causes apparent substitution rate variation

Substitution rates are known to be variable among genes, chromosomes, species, and lineages due to multifarious biological processes. Here, we consider another source of substitution rate variation due to a technical bias associated with gene tree discordance. Discordance has been found to be rampant in genome-wide data sets, often due to incomplete lineage sorting (ILS). This apparent substitution rate variation is caused when substitutions that occur on discordant gene trees are analyzed in the context of a single, fixed species tree. Such substitutions have to be resolved by proposing multiple substitutions on the species tree, and we therefore refer to this phenomenon as Substitutions Produced by ILS (SPILS). We use simulations to demonstrate that SPILS has a larger effect with increasing levels of ILS, and on trees with larger numbers of taxa. Specific branches of the species trees are consistently, but erroneously, inferred to be longer or shorter, and we show that these branches can be predicted based on discordant tree topologies. Moreover, we observe that fixing a species tree topology when performing tests of positive selection increases the false positive rate, particularly for genes whose discordant topologies are most affected by SPILS. Finally, we use data from multiple Drosophila species to show that SPILS can be detected in nature. Although the effects of SPILS are modest per gene, it has the potential to affect substitution rate variation whenever high levels of ILS are present, particularly in rapid radiations. The problems outlined here have implications for character mapping of any type of trait, and for any biological process that causes discordance. We discuss possible solutions to these problems, and areas in which they are likely to have caused faulty inferences of convergence and accelerated evolution.
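The core mechanism can be illustrated with a small sketch. It is not the authors' simulation pipeline: it assumes a hypothetical four-taxon example (taxa A, B, C and an outgroup O), a single binary site, and Fitch parsimony as a stand-in for whatever substitution-mapping method is actually used. One substitution on the internal branch of a discordant gene tree produces a site pattern that needs two substitutions when mapped onto the fixed species tree.

```python
# Illustrative only: the taxon names, tree shapes, and the use of Fitch
# parsimony are assumptions made for this sketch, not taken from the paper.

def fitch(tree, states):
    """Return (possible_states, min_changes) for a rooted binary tree.

    `tree` is a taxon name (leaf) or a 2-tuple of subtrees.
    `states` maps each taxon name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one change is required on a descendant branch.
    return left_set | right_set, left_cost + right_cost + 1

# Fixed species tree groups A with B; the discordant gene tree groups B with C.
species_tree = ((("A", "B"), "C"), "O")
gene_tree = (("A", ("B", "C")), "O")

# A single 0 -> 1 substitution on the gene-tree branch ancestral to B and C.
site = {"A": 0, "B": 1, "C": 1, "O": 0}

_, gene_changes = fitch(gene_tree, site)
_, species_changes = fitch(species_tree, site)

print("changes on gene tree:", gene_changes)
print("changes on species tree:", species_changes)
```

On the gene tree the pattern is explained by one change, whereas the species tree requires two. The second change is the spurious one that SPILS describes, and it is assigned to branches of the species tree where no substitution actually occurred.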
