Protein coding palindromes are a unique but recurrent feature in Rickettsia.

Rickettsia are unique in carrying a number of palindromic sequences inserted in-frame within protein-coding regions. In this study, we extensively analyzed repeated sequences in the genome of Rickettsia conorii and examined their locations with respect to coding versus noncoding regions. We identified 656 interspersed repeated sequences, classified into 10 distinct families. Of these, three palindromic sequence families showed clear cases of insertion into open reading frames (ORFs). The locations of these in-frame insertions appear to be always compatible with the three-dimensional (3-D) fold and function of the encoded protein. We provide evidence for a progressive loss of the palindromic property over time after insertion. This comprehensive study of Rickettsia repeats confirms and extends our previous observations and further indicates a significant role of selfish DNAs in the creation and modification of proteins.
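As an illustration only, the two properties discussed above (a sequence being palindromic, and an insertion preserving the reading frame) can be expressed in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, "palindromic" is taken to mean that a DNA sequence equals its own reverse complement, and the mismatch count is just one crude way to picture how far a sequence has drifted from a perfect palindrome.

```python
# Hypothetical helpers for illustration; not the authors' pipeline.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_mismatches(seq: str) -> int:
    """Count positions where seq differs from its reverse complement.

    A perfect (even-length) DNA palindrome scores 0. Each disrupted
    base pair shows up as two mismatched positions, one on each arm.
    """
    s = seq.upper()
    rc = reverse_complement(s)
    return sum(a != b for a, b in zip(s, rc))


def is_in_frame(insert_length: int) -> bool:
    """An insertion keeps the downstream reading frame only if its
    length is a multiple of three nucleotides."""
    return insert_length % 3 == 0


if __name__ == "__main__":
    print(palindrome_mismatches("GAATTC"))  # perfect palindrome
    print(palindrome_mismatches("GAATTA"))  # one degraded base pair
    print(is_in_frame(6), is_in_frame(7))
```

Under these assumptions, a rising mismatch count across related copies of a repeat would be consistent with the gradual loss of palindromic structure described in the abstract, although a real analysis would rely on alignments and RNA or DNA folding predictions rather than a simple positional comparison.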
