RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12

Abstract: RegulonDB, first published 20 years ago, is a comprehensive electronic resource on the regulation of transcription initiation in Escherichia coli K-12, with decades of knowledge from classic molecular biology experiments and, more recently, from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets of interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof-of-concept pipeline that merges binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology, with a controlled vocabulary for the minimal properties needed to reproduce an experiment, which helps integrate data from high-throughput and classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research on natural language processing strategies to enhance our biocuration work.
