Co-expression pattern from DNA microarray experiments as a tool for operon prediction

The prediction of operons, the smallest units of transcription in prokaryotes, is the first step towards reconstruction of a regulatory network at the whole-genome level. Sequence information, in particular the distance between open reading frames, has been used to predict whether adjacent Escherichia coli genes are in an operon. While appreciably successful, these predictions need to be validated and refined experimentally. As a growing number of gene expression array experiments on E. coli have become available, we investigated to what extent they could be used to improve and validate these predictions. To this end, we examined a large collection of published microarray data. The correlation between expression ratios of adjacent genes was used in a Bayesian classification scheme to predict whether the genes are in an operon or not. We found that for the genes whose expression levels change significantly across the experiments in the data set, the currently available gene expression data allowed a significant refinement of the sequence-based predictions. We report these co-expression correlations in an E. coli genomic map. For a significant portion of gene pairs, however, the set of array experiments considered did not contain sufficient information to determine whether they are in the same transcriptional unit. This is not due to unreliability of the array data per se, but to the design of the experiments analyzed. In general, experiments that perturb a large number of genes offer more information for operon prediction than confined perturbations. These results provide a rationale for conducting expression studies comparing conditions that cause global changes in gene expression.
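The classification idea described above can be illustrated with a minimal sketch. The abstract does not specify how the likelihoods were estimated, so everything concrete here is an assumption made for illustration: the function names, the use of Pearson correlation on log expression ratios, the histogram likelihoods with add-one smoothing, and the choice to supply the sequence-based prediction as the prior.

```python
import math

def pearson(x, y):
    """Pearson correlation between two expression-ratio profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A gene that never changes carries no co-expression signal;
        # treating it as r = 0 is a simplification assumed for this sketch.
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def bin_index(r, n_bins):
    """Map a correlation in [-1, 1] to one of n_bins equal-width bins."""
    i = int((r + 1.0) / 2.0 * n_bins)
    return min(max(i, 0), n_bins - 1)

def binned_likelihood(train_r, n_bins=4):
    """Estimate P(r in bin | class) from training correlations.

    Add-one smoothing (assumed) keeps empty bins from giving zero likelihood.
    """
    counts = [1.0] * n_bins
    for r in train_r:
        counts[bin_index(r, n_bins)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def posterior_operon(r, lik_op, lik_non, prior_op):
    """Bayes' rule: P(same operon | r) for one adjacent gene pair.

    prior_op is a hypothetical input, e.g. a sequence-based probability
    derived from the intergenic distance.
    """
    b = bin_index(r, len(lik_op))
    num = lik_op[b] * prior_op
    return num / (num + lik_non[b] * (1.0 - prior_op))
```

In this sketch, a pair whose correlation falls in a bin where both classes are equally likely leaves the posterior equal to the prior. This mirrors the observation that gene pairs whose expression barely changes across the experiments cannot be resolved by the array data.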
