Disentangling the effects of selection and loss bias on gene dynamics

Significance

Evolution of microbes is dominated by horizontal gene transfer and the incessant host–parasite arms race that promotes the evolution of diverse antiparasite defense systems. The evolutionary factors governing these processes are complex and difficult to disentangle, but rapidly growing genome databases provide ample material for testing evolutionary models. Rigorous mathematical modeling of evolutionary processes, combined with computer simulation and comparative genomics, allowed us to elucidate the evolutionary regimes of different classes of microbial genes. Only genes involved in key informational and metabolic pathways are subject to strong selection, whereas most of the others are effectively neutral or even burdensome. Mobile genetic elements and defense systems are costly, supporting the understanding that their evolution is governed by the same factors.

We combine mathematical modeling of genome evolution with comparative analysis of prokaryotic genomes to estimate the relative contributions of selection and intrinsic loss bias to the evolution of different functional classes of genes and mobile genetic elements (MGEs). An exact solution for the dynamics of gene family size was obtained under a linear duplication–transfer–loss model with selection. With the exception of genes involved in information processing, particularly translation, which are maintained by strong selection, the average selection coefficient for most nonparasitic genes is low albeit positive, consistent with the observed positive correlation between genome size and effective population size. Free-living microbes evolve under stronger selection for gene retention than parasites. Different classes of MGEs show a broad range of fitness effects, from the nearly neutral transposons to prophages, which are actively eliminated by selection. Genes involved in antiparasite defense, on average, incur a fitness cost to the host that is at least as high as the cost of plasmids. This cost is probably due to autoimmunity and the curtailment of horizontal gene transfer caused by the defense systems, as well as to the selfish behavior of some of these systems, such as toxin–antitoxin and restriction–modification modules. Transposons follow biphasic dynamics, with bursts of proliferation followed by decay in copy number, which the model captures quantitatively. The ratio of horizontal gene transfer to loss, but not the ratio of duplication to loss, correlates with genome size, potentially explaining the increased abundance of neutral and costly elements in larger genomes.
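To make the phrase "linear duplication–transfer–loss model" concrete, the following is a minimal sketch of a generic linear birth-death-immigration process, not the authors' model or code. It assumes a constant transfer (gain) rate `h`, a per-copy duplication rate `d`, and a per-copy loss rate `l`; these symbols and function names are illustrative assumptions, and selection is not modeled explicitly here (one could think of it as folded into an effective loss rate, which is itself an assumption). The sketch compares a Gillespie-style stochastic simulation of copy number with the mean given by dn/dt = h + (d - l) n.

```python
import math
import random


def mean_copy_number(t, n0, h, d, l):
    """Mean copy number at time t for dn/dt = h + (d - l) * n, n(0) = n0.

    h, d, l are assumed (hypothetical) transfer, duplication, and loss rates.
    """
    r = d - l
    if r == 0:
        return n0 + h * t
    return (n0 + h / r) * math.exp(r * t) - h / r


def simulate_copy_number(t_end, n0, h, d, l, rng):
    """One stochastic trajectory (Gillespie algorithm); returns n at t_end."""
    t, n = 0.0, n0
    while True:
        gain = h + d * n   # transfer plus duplication of existing copies
        loss = l * n       # each copy is lost independently
        total = gain + loss
        if total == 0:
            return n       # no event can occur (only if h == 0 and n == 0)
        t += rng.expovariate(total)
        if t > t_end:
            return n
        if rng.random() * total < gain:
            n += 1
        else:
            n -= 1


if __name__ == "__main__":
    rng = random.Random(1)
    h, d, l = 0.1, 0.2, 1.0   # illustrative values only, not estimates
    runs = [simulate_copy_number(5.0, 1, h, d, l, rng) for _ in range(20000)]
    print("simulated mean:", sum(runs) / len(runs))
    print("analytic mean: ", mean_copy_number(5.0, 1, h, d, l))
```

When loss exceeds duplication (`l > d`), the mean relaxes to the stationary value `h / (l - d)`, so in this simplified picture the transfer-to-loss and duplication-to-loss ratios jointly set the expected family size.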
