Recombination produces coherent bacterial species clusters in both core and accessory genomes

Background: Population samples show that bacterial genomes can be divided into a core of ubiquitous genes and accessory genes that are present in only a fraction of isolates. The ecological significance of this variation in gene content remains unclear. However, microbiologists agree that a bacterial species should be ‘genomically coherent’, even though there is no consensus on how this should be determined.

Results: We use a parsimonious model of diversification in both the core and accessory genome. The model includes mutation, homologous recombination (HR) and horizontal gene transfer (HGT) that introduces new loci, and it produces a population of interacting clusters of strains with varying genome content. Loci introduced by HGT may subsequently be passed on by HR. The model fits a systematic population sample of 616 pneumococcal genomes well, capturing the major features of the population structure with parameter values that agree well with empirical estimates.

Conclusions: The model does not include explicit selection on individual genes, suggesting that crude comparisons of gene content may be a poor predictor of ecological function. We identify a clearly divergent subpopulation of pneumococci that is inconsistent with the model and may be considered genomically incoherent with the rest of the population. These strains have a distinct disease tropism and may be rationally defined as a separate species. We also find deviations from the model that may be explained by recent population bottlenecks or spatial structure.
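The three processes named in the Results (mutation, HR, and HGT introducing new loci) can be illustrated with a toy neutral simulation. This is only a sketch under stated assumptions, not the authors' simulator: the Wright-Fisher resampling step, the fixed recombination tract length, the way HR copies the donor's state at one accessory locus, and every function name, parameter name and default value are hypothetical choices made for illustration.

```python
import random


def simulate(n_strains=50, core_len=40, generations=100,
             mu=0.01, hr_rate=0.05, hgt_rate=0.01, tract_len=5, seed=0):
    """Toy neutral model; all names and rates here are illustrative assumptions."""
    rng = random.Random(seed)
    core = [[0] * core_len for _ in range(n_strains)]   # core alleles per strain
    accessory = [set() for _ in range(n_strains)]       # accessory locus ids per strain
    next_allele, next_locus = 1, 0
    for _ in range(generations):
        # Neutral reproduction: each offspring copies a randomly chosen parent.
        parents = [rng.randrange(n_strains) for _ in range(n_strains)]
        core = [list(core[p]) for p in parents]
        accessory = [set(accessory[p]) for p in parents]
        for i in range(n_strains):
            # Mutation: a previously unseen allele at one random core site.
            if rng.random() < mu:
                core[i][rng.randrange(core_len)] = next_allele
                next_allele += 1
            # HGT: a brand-new accessory locus enters the population.
            if rng.random() < hgt_rate:
                accessory[i].add(next_locus)
                next_locus += 1
            # HR: copy a short core tract from a donor, plus the donor's
            # presence/absence state at one accessory locus, so loci that
            # arrived by HGT can be passed on between strains.
            if rng.random() < hr_rate:
                donor = rng.randrange(n_strains)
                start = rng.randrange(core_len)
                end = min(core_len, start + tract_len)
                core[i][start:end] = core[donor][start:end]
                pool = sorted(accessory[i] | accessory[donor])
                if pool:
                    locus = rng.choice(pool)
                    if locus in accessory[donor]:
                        accessory[i].add(locus)
                    else:
                        accessory[i].discard(locus)
    return core, accessory


def core_distance(a, b):
    """Fraction of core sites at which two strains differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def accessory_distance(a, b):
    """Jaccard distance between two sets of accessory loci."""
    union = a | b
    return len(a ^ b) / len(union) if union else 0.0
```

Plotting `core_distance` against `accessory_distance` for all strain pairs is one simple way to look for clusters in such a simulated population; whether this toy reproduces the structure reported for the 616 pneumococcal genomes is not something the sketch can show.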
