Bacterial Genomes
Bacterial similarity relationships are inferred using sequence information derived from large aggregates of genomic sequences. Comparisons within and between species sample sequences are based on the vector of dinucleotide relative abundance values, referred to as the genomic signature. Recent studies have demonstrated that the dinucleotide relative abundance values (profiles) of different DNA sequence samples (sample size ≥50 kb) from the same organism are generally much more similar to each other than they are to profiles from other organisms, and that closely related organisms generally have more similar profiles than do distantly related organisms. These highly stable DNA-doublet profiles suggest that there may be genome-wide factors, such as functions of the replication and repair machinery, that impose limits on the compositional and structural patterns of a genomic sequence. The genomic signatures of all prokaryotic genomes with available non-redundant DNA of at least about 100 kb were compared. These include 21 proteobacteria, 10 Gram-positives, three cyanobacteria, three archaea, and three unclassified sequences. Among specific results, the genomic signature of thermophilic Archaea deviates substantially from the signature of halophilic Archaea. Anabaena sequences are relatively close to the Gram-positives L. lactis and S. aureus. Gram-positives divide into at least five subgroups. The dinucleotide TA is almost universally under-represented; GC is pervasively over-represented in γ- and β-proteobacterial genomes; AT is high in α-proteobacteria; and CG is low in many thermophiles. Interpretations center on DNA structures (e.g., base-step stacking energies, conformational tendencies) and context-dependent mutational biases.
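The abstract does not spell out how the genomic signature is computed, so the following Python sketch rests on stated assumptions rather than on the authors' exact procedure. It assumes the commonly used odds-ratio form, in which the relative abundance of dinucleotide XY is f_XY / (f_X f_Y), with frequencies counted over the sequence together with its reverse complement, and it assumes that two signatures are compared by the mean absolute difference over the 16 dinucleotides. The function names `genomic_signature` and `signature_distance` are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.translate(COMPLEMENT)[::-1]


def genomic_signature(seq):
    """Dinucleotide relative abundances (assumed odds-ratio form).

    Counts are pooled over the sequence and its reverse complement,
    so the result does not depend on which strand was supplied.
    Positions holding ambiguous symbols (e.g. N) are skipped.
    """
    seq = seq.upper()
    mono = Counter()
    di = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i, base in enumerate(strand):
            if base not in BASES:
                continue
            mono[base] += 1
            if i + 1 < len(strand) and strand[i + 1] in BASES:
                di[base + strand[i + 1]] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_di == 0:
        raise ValueError("sequence contains no valid dinucleotides")
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / n_mono
        fy = mono[y] / n_mono
        fxy = di[x + y] / n_di
        # Undefined when a base is absent from the sample.
        signature[x + y] = fxy / (fx * fy) if fx and fy else float("nan")
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference over the 16 dinucleotides (assumed metric)."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

Under these assumptions, a value well below 1 for a dinucleotide (as reported for TA) indicates under-representation relative to the base composition, and a value well above 1 indicates over-representation. Samples far shorter than the roughly 50 kb used in the study would give noisy estimates.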