Integrative inference of gene-regulatory networks in Escherichia coli using information theoretic concepts and sequence analysis

Background: Although Escherichia coli is one of the best-studied model organisms, a comprehensive understanding of its gene regulation has not yet been achieved. Many approaches exist to reconstruct regulatory interaction networks from gene expression experiments. Mutual-information-based approaches are the most useful for large-scale network inference.

Results: We used a three-step approach that combines gene regulatory network inference based on directed information (DTI) with sequence analysis. DTI values were calculated on a set of gene expression profiles from 19 time-course experiments extracted from the Many Microbe Microarrays Database. Focusing on influences between pairs of genes in which one partner encodes a transcription factor (TF), we derived a network containing 878 TF-gene interactions, of which 166 are known according to RegulonDB. In the second step, we selected a subset of 109 interactions that could be confirmed by the presence of a phylogenetically conserved binding site of the respective regulator. This step increased the fraction of known interactions from 19% to 60%. In the last step, we checked the 44 of the 109 interactions not yet included in RegulonDB for functional relationships between the regulator and the target and thus obtained ten TF-target gene interactions. Five of them concern the regulator LexA and have already been reported in the literature. The remaining five describe regulation by Fis (with two novel targets), PdhR, PhoP, and KdgR. To validate our approach, one of them, the regulation of lipoate synthase (LipA) by the pyruvate-sensing pyruvate dehydrogenase repressor (PdhR), was tested experimentally and confirmed.

Conclusions: We predicted a set of five novel TF-target gene interactions in E. coli. One of them, the regulation of lipA by the transcriptional regulator PdhR, was validated experimentally. Furthermore, we developed DTInfer, a new R package for the inference of gene-regulatory networks from microarrays using directed information.
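The enrichment reported for the second filtering step can be checked directly from the counts given in the abstract. The following is a minimal sketch of that arithmetic only; the helper name `known_fraction` and the variable names are illustrative assumptions and are not part of DTInfer.

```python
def known_fraction(known: int, total: int) -> float:
    """Return the share of interactions already listed in RegulonDB."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


# Counts taken from the abstract (variable names are hypothetical).
step1_total, step1_known = 878, 166        # DTI-derived network
step2_total, step2_unknown = 109, 44       # after the conserved-binding-site filter
step2_known = step2_total - step2_unknown  # interactions already in RegulonDB

# Percentages rounded to whole numbers, as quoted in the abstract.
step1_pct = round(100 * known_fraction(step1_known, step1_total))
step2_pct = round(100 * known_fraction(step2_known, step2_total))

print(step1_pct, step2_pct)  # prints "19 60"
```

Of the 109 interactions retained by the sequence filter, 65 are already known, which corresponds to the increase from 19% to 60% stated above.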

[47]  Byung-Kwan Cho,et al.  Genome-wide analysis of Fis binding in Escherichia coli indicates a causative role for A-/AT-tracts. , 2008, Genome research.