Differentiation of regions with atypical oligonucleotide composition in bacterial genomes

Background: Complete sequencing of bacterial genomes has become a common technique in present-day microbiology. Data mining of the complete sequence is the essential next step. New in silico methods are needed that rapidly identify the major features of genome organization and facilitate the prediction of the functional class of ORFs. We tested the usefulness of local oligonucleotide usage (OU) patterns to recognize and differentiate types of atypical oligonucleotide composition in DNA sequences of bacterial genomes.

Results: A total of 163 bacterial genomes of eubacteria and archaea published in the NCBI database were analyzed. Local OU patterns exhibit substantial intrachromosomal variation in bacteria. Loci with alternative OU patterns were parts of horizontally acquired gene islands or ancient regions such as genes for ribosomal proteins and RNAs. OU statistical parameters, such as local pattern deviation (D), pattern skew (PS) and OU variance (OUV), enabled the detection and visualization of gene islands of different functional classes.

Conclusion: A set of approaches has been designed for the statistical analysis of nucleotide sequences of bacterial genomes. These methods are useful for the visualization and differentiation of regions with atypical oligonucleotide composition prior to or accompanying gene annotation.
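The general idea of comparing a local OU pattern with the genome-wide pattern can be illustrated with a minimal sketch. It is not the D, PS or OUV statistics of this study, whose exact definitions are not given in the abstract. As a simplified stand-in it assumes tetranucleotide words, a sliding window, and half the sum of absolute frequency differences as the distance. The function names `kmer_freqs` and `local_deviation` and the default window and step sizes are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_freqs(seq, k=4):
    """Relative frequency of every k-mer over ACGT in seq (overlapping words)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Words containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[w] for w in words)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def local_deviation(seq, window=5000, step=1000, k=4):
    """Return (window_start, distance) pairs along the sequence.

    Assumption: distance is half the sum of absolute differences between
    local and global k-mer frequencies, so it lies between 0 and 1.
    This is an illustrative measure, not the paper's D statistic.
    """
    global_f = kmer_freqs(seq, k)
    result = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        local_f = kmer_freqs(seq[start:start + window], k)
        dist = 0.5 * sum(abs(local_f[w] - global_f[w]) for w in global_f)
        result.append((start, dist))
    return result


if __name__ == "__main__":
    # Toy sequence: a uniform background with one compositionally odd insert.
    toy = "ACGT" * 100 + "A" * 100 + "ACGT" * 100
    for start, dist in local_deviation(toy, window=100, step=100):
        print(start, round(dist, 3))
```

In this toy run the window that covers the inserted stretch receives the largest distance, which is the kind of signal a plot of local deviation along a chromosome would show for a candidate gene island.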
