Small variable segments constitute a major type of diversity of bacterial genomes at the species level

Background: Analysis of large-scale diversity in bacterial genomes has mainly focused on elements such as pathogenicity islands or, more generally, genomic islands. These comprise numerous genes and confer important phenotypes, and they are present or absent depending on the strain. We report that, despite this widely accepted notion, most diversity at the species level is composed of much smaller DNA segments, 20 to 500 bp in size, which we call microdiversity.

Results: We performed a systematic analysis of the variable segments detected by multiple whole-genome alignments at the DNA level in the three species for which the greatest number of genomes have been sequenced: Escherichia coli, Staphylococcus aureus, and Streptococcus pyogenes. Among the numerous sites of variability, 62 to 73% were loci of microdiversity, many of which were located within genes. These loci contribute to phenotypic variation, as 3 to 6% of all genes harbor microdiversity, and 1 to 9% of all genes are located downstream from a microdiversity locus. Microdiversity loci are particularly abundant in genes encoding membrane proteins. In-depth analysis of the E. coli alignments shows that most of the diversity does not correspond to known mobile or repeated elements and was likely generated by illegitimate recombination. An intriguing class of microdiversity includes small blocks of highly diverged sequences, whose origin is discussed.

Conclusions: This analysis uncovers the importance of this small-sized genome diversity, which we expect to be present in a wide range of bacteria, and possibly also in many eukaryotic genomes.
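The size criterion in the abstract (variable segments of 20 to 500 bp) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the inclusive treatment of both bounds, and the example segment lengths are all assumptions made for illustration only.

```python
from typing import Iterable

# Size bounds taken from the abstract; treating both as inclusive is an assumption.
MICRO_MIN_BP = 20
MICRO_MAX_BP = 500


def is_microdiversity(length_bp: int) -> bool:
    """Return True if a variable segment falls in the 20-500 bp size range."""
    return MICRO_MIN_BP <= length_bp <= MICRO_MAX_BP


def microdiversity_fraction(segment_lengths: Iterable[int]) -> float:
    """Fraction of variable segments whose length is in the microdiversity range.

    Returns 0.0 for an empty input rather than raising.
    """
    lengths = list(segment_lengths)
    if not lengths:
        return 0.0
    micro = sum(1 for n in lengths if is_microdiversity(n))
    return micro / len(lengths)


if __name__ == "__main__":
    # Hypothetical segment lengths (bp), not data from the paper.
    example_lengths = [25, 130, 480, 1200, 35000, 60, 8000, 310]
    share = microdiversity_fraction(example_lengths)
    print(f"microdiversity share of variable segments: {share:.1%}")
```

Applied to the variable segments reported by a multiple genome alignment, a calculation of this kind would yield the per-species share of microdiversity loci that the abstract reports as 62 to 73%.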
