The large majority of intergenic sites in bacteria are selectively constrained, even when known regulatory elements are excluded

There are currently no broad estimates of the overall strength and direction of selection operating on intergenic variation in bacteria. Here we address this gap using large whole-genome sequence datasets representing six diverse bacterial species: Escherichia coli, Staphylococcus aureus, Salmonella enterica, Streptococcus pneumoniae, Klebsiella pneumoniae, and Mycobacterium tuberculosis. Excluding M. tuberculosis, we find that a high proportion (62% to 79%; mean 70%) of intergenic sites are selectively constrained relative to synonymous sites. Non-coding RNAs tend to be under stronger selective constraint than promoters, which in turn are typically more constrained than rho-independent terminators. Even when these regulatory elements are excluded, the mean proportion of constrained intergenic sites falls only to 69%; thus our current understanding of the functionality of intergenic regions (IGRs) in bacteria is severely limited. Consistent with a role for positive as well as negative selection on intergenic sites, we present evidence for strong positive selection in M. tuberculosis promoters, underlining the key role of regulatory changes as an adaptive mechanism in this highly monomorphic pathogen.
