Identification, variation and transcription of pneumococcal repeat sequences

Background

Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics.

Results

Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR.

Conclusions

BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/.
