A DNA structural atlas for Escherichia coli.

We have performed a computational analysis of DNA structural features in 18 fully sequenced prokaryotic genomes using models for DNA curvature, DNA flexibility, and DNA stability. The structural values computed for the Escherichia coli chromosome are significantly different from (and generally more extreme than) those expected from the nucleotide composition. To aid this analysis, we have constructed tools that plot structural measures for all positions in a long DNA sequence (e.g. an entire chromosome) in the form of color-coded wheels (http://www.cbs.dtu.dk/services/GenomeAtlas/). We find that these "structural atlases" are useful for the discovery of interesting features that may then be investigated in more depth using statistical methods. From investigation of the E. coli structural atlas, we discovered a genome-wide trend, where an extended region encompassing the terminus displays a high level of curvature, a low level of flexibility, and a low degree of helix stability. The same situation is found in the distantly related Gram-positive bacterium Bacillus subtilis, suggesting that the phenomenon is biologically relevant. Based on a search for long DNA segments where all the independent structural measures agree, we have found a set of 20 regions with identical and very extreme structural properties. Due to their strong inherent curvature, we suggest that these may function as topological domain boundaries by efficiently organizing plectonemically supercoiled DNA. Interestingly, we find that in practically all the investigated eubacterial and archaeal genomes, promoter DNA tends to be more curved, less flexible, and less stable than DNA in coding regions and in intergenic DNA without promoters. This trend is present regardless of the absolute levels of the structural parameters, and we suggest that it may be related to the requirement for helix unwinding during initiation of transcription, or perhaps to the previously observed location of promoters at the apex of plectonemically supercoiled DNA. We have also analyzed the structural similarities between groups of genes by clustering all RNA and protein-encoding genes in E. coli based on their average structural parameters. We find that most ribosomal genes (protein-encoding as well as rRNA genes) cluster together, and we suggest that DNA structure may play a role in the transcription of these highly expressed genes.
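As a rough illustration of how a structural measure can be computed for every position of a circular chromosome before it is plotted as a wheel, the Python sketch below scores each dinucleotide step and smooths the scores with a window that wraps around the sequence end. It is a minimal sketch under stated assumptions, not the method used in this work: the function names, the window handling, and the values in `TOY_SCALE` are hypothetical placeholders and are not real curvature, flexibility, or stability parameters.

```python
from typing import Dict, List

# Hypothetical placeholder scale for illustration only.
# These numbers are NOT published structural parameters.
TOY_SCALE: Dict[str, float] = {"AA": 1.0, "AT": 0.8, "TA": 0.6, "TT": 1.0}


def step_scores(seq: str, scale: Dict[str, float], default: float = 0.0) -> List[float]:
    """Score every dinucleotide step, treating the sequence as circular."""
    seq = seq.upper()
    n = len(seq)
    # Position i is scored by the step seq[i] -> seq[i+1]; the last base
    # wraps around to the first one.
    return [scale.get(seq[i] + seq[(i + 1) % n], default) for i in range(n)]


def circular_window_mean(values: List[float], window: int) -> List[float]:
    """Mean of `values` in a window centered on each position, with wrap-around."""
    n = len(values)
    if window < 1 or window > n:
        raise ValueError("window must be between 1 and len(values)")
    half = window // 2
    # Sum of the window centered on position 0.
    total = sum(values[(i - half) % n] for i in range(window))
    means = []
    for pos in range(n):
        means.append(total / window)
        # Slide the window one step: drop the leftmost value, add the next one.
        total -= values[(pos - half) % n]
        total += values[(pos - half + window) % n]
    return means


if __name__ == "__main__":
    scores = step_scores("AATT", TOY_SCALE)
    print(scores)
    print(circular_window_mean(scores, 3))
```

For a real genome the window would span many base pairs, and one smoothed track per structural model could then be mapped to a color scale around the wheel.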
