Remarkable sequence signatures in archaeal genomes.

Complete archaeal genomes were probed for long (≥ 25 bp) oligonucleotide repeats (words). We detected many words arranged in tandem with narrow ranges of periodicity (i.e., spacer length between repeats). Similar words were not identified in the genomes of the non-archaeal species examined, namely Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Mycoplasma genitalium and Mycoplasma pneumoniae. BLAST similarity searches against the GenBank nucleotide sequence database revealed that these words were specific to individual archaeal species, indicating that they have a signature character. Sequence analysis and genome viewing tools showed these repeats to be restricted to non-coding regions. Thus, archaea appear to possess a non-coding genomic signature that is absent from the bacterial species examined. The identification of a species-specific genomic signature would be of great value to archaeal genome mapping, evolutionary studies and analyses of genome complexity.
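The kind of search described in the abstract (long words that recur with a narrow range of spacer lengths) can be illustrated with a minimal sketch. This is not the authors' tool or method: the function names, the exact-match k-mer index, and the `min_copies` and `max_spread` thresholds are all assumptions chosen for illustration. Only the 25 bp minimum word length comes from the text.

```python
from collections import defaultdict


def find_repeated_words(seq, k=25):
    """Return every k-mer that occurs more than once, with its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)  # positions are appended in ascending order
    return {word: starts for word, starts in positions.items() if len(starts) > 1}


def spacer_lengths(starts, k):
    """Distance from the end of each copy to the start of the next copy."""
    return [nxt - (cur + k) for cur, nxt in zip(starts, starts[1:])]


def periodic_words(seq, k=25, min_copies=3, max_spread=5):
    """Keep words whose consecutive copies are separated by similar spacers.

    min_copies and max_spread are hypothetical thresholds, not values
    taken from the paper.
    """
    hits = {}
    for word, starts in find_repeated_words(seq, k).items():
        if len(starts) < min_copies:
            continue
        gaps = spacer_lengths(starts, k)
        # A negative gap means two copies overlap (e.g. simple-sequence runs); skip those.
        if min(gaps) >= 0 and max(gaps) - min(gaps) <= max_spread:
            hits[word] = gaps
    return hits
```

On a toy sequence built from four copies of one 25 bp word separated by spacers of 30, 31 and 32 bp, `periodic_words` reports that word with the spacer list `[30, 31, 32]`. A real analysis would also need to handle both strands, inexact repeats, and the many overlapping k-mers that a single longer repeat generates, none of which this sketch attempts.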
