A neutral theory of genome evolution and the frequency distribution of genes

Background: The gene composition of bacteria of the same species can differ significantly between isolates. Variability in gene composition can be summarized in terms of gene frequency distributions, in which individual genes are ranked according to the frequency of genomes in which they appear. Empirical gene frequency distributions possess a U-shape, such that there are many rare genes, some genes of intermediate occurrence, and many common genes. It would seem that U-shaped gene frequency distributions can be used to infer the essentiality and/or importance of a gene to a species. Here, we ask: can U-shaped gene frequency distributions, instead, arise generically via neutral processes of genome evolution?

Results: We introduce a neutral model of genome evolution which combines birth-death processes at the organismal level with gene uptake and loss at the genomic level. This model predicts that gene frequency distributions possess a characteristic U-shape even in the absence of selective forces driving genome and population structure. We compare the model predictions to empirical gene frequency distributions from 6 multiply sequenced species of bacterial pathogens. We fit the model with constant population size to data, matching U-shape distributions albeit without matching all quantitative features of the distribution. We find stronger model fits in the case where we consider exponentially growing populations. We also show that two alternative models which contain a "rigid" and "flexible" core component of genomes provide strong fits to gene frequency distributions.

Conclusions: The analysis of neutral models of genome evolution suggests that U-shaped gene frequency distributions provide less information than previously suggested regarding gene essentiality. We discuss the need for additional theory and genomic level information to disentangle the roles of evolutionary mechanisms operating within and amongst individuals in driving the dynamics of gene distributions.
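To make the kind of model described in the Results concrete, the following is a minimal simulation sketch and not the authors' implementation. It assumes a Moran-style birth-death step at constant population size, a fixed number of genes per genome, and gene turnover in which one gene is lost and replaced by a novel gene drawn from an effectively infinite pool. The function name `simulate_gene_frequencies` and the parameters `n_genomes`, `genome_size`, `n_steps`, and `swap_prob` are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def simulate_gene_frequencies(n_genomes=20, genome_size=50, n_steps=20000,
                              swap_prob=0.05, seed=0):
    """Neutral sketch (assumed formulation, not the paper's code).

    Returns a list hist where hist[k] is the number of distinct genes
    present in exactly k of the n_genomes sampled genomes.
    """
    rng = random.Random(seed)

    # Assumption: start from a clonal population sharing one founder genome.
    founder = set(range(genome_size))
    population = [set(founder) for _ in range(n_genomes)]
    next_gene_id = genome_size  # novel genes get fresh integer labels

    for _ in range(n_steps):
        # Birth-death at the organismal level: a randomly chosen genome
        # reproduces and its copy replaces a randomly chosen genome.
        parent = rng.randrange(n_genomes)
        dead = rng.randrange(n_genomes)
        population[dead] = set(population[parent])

        # Gene uptake and loss at the genomic level: with probability
        # swap_prob the new copy loses one gene and gains a novel one,
        # so genome size stays fixed (a simplifying assumption).
        if rng.random() < swap_prob:
            lost = rng.choice(sorted(population[dead]))
            population[dead].remove(lost)
            population[dead].add(next_gene_id)
            next_gene_id += 1

    # Count, for every gene, the number of genomes that carry it.
    carriers = Counter(gene for genome in population for gene in genome)
    hist = [0] * (n_genomes + 1)
    for k in carriers.values():
        hist[k] += 1
    return hist


if __name__ == "__main__":
    hist = simulate_gene_frequencies()
    for k, n_genes in enumerate(hist):
        if k > 0:
            print(k, n_genes)
```

Printing `hist[1]` through `hist[n_genomes]` gives the simulated gene frequency distribution, which can be inspected for an excess of rare and common genes relative to intermediate ones. No selection acts on any gene in this sketch, so any structure in the output comes from drift and turnover alone; whether a clear U-shape appears will depend on the assumed parameter values.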
