Identifying the promoter features governing differential kinetics of co-regulated genes using fuzzy expressions

One of the biggest challenges in genomics is elucidating the design principles that control gene expression. Current approaches examine promoter sequences for particular features, such as the presence of binding sites for a transcriptional regulator, and identify recurrent relationships among these features, termed network motifs. To define the expression dynamics of a group of genes, the strength of the connections in a network must be specified, and these strengths are determined by the cis-promoter features participating in the regulation. Approaches that homogenize features among promoters (e.g., by relying on consensus sequences to describe the various promoter features), or even across species, hamper the discovery of the key differences that distinguish promoters co-regulated by the same transcriptional regulator. We have therefore developed an approach based on fuzzy logic expressions to analyze proteobacterial genomes for promoter features; it is specifically designed to account for the variability in sequence, location and topology intrinsic to differential gene expression. We applied our method to characterize network motifs controlled by the PhoP/PhoQ regulatory system of Escherichia coli and Salmonella enterica serovar Typhimurium. We identify key features that enable the PhoP protein to produce distinct kinetic patterns in target genes, features that could not have been uncovered just by inspecting network motifs.
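As a rough illustration of what a fuzzy logic expression over promoter features can look like, the sketch below grades two hypothetical promoters by the degree to which each has a strong binding site AND a site located near the transcription start. It is not the authors' method: the feature names, the membership breakpoints, the example values, and the use of the minimum as the fuzzy AND are all assumptions chosen for illustration.

```python
def ramp_up(x, lo, hi):
    """Membership that rises linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def trapezoid(x, a, b, c, d):
    """Membership that is 1 on [b, c], 0 outside (a, d), linear in between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def fuzzy_and(*degrees):
    """Minimum t-norm: a common (assumed) choice for fuzzy conjunction."""
    return min(degrees)


def promoter_degree(site_score, distance_bp):
    """Degree to which a promoter has a strong site AND a proximal site.

    Both inputs and all breakpoints are hypothetical, not taken from the paper.
    """
    strong_site = ramp_up(site_score, 0.4, 0.8)
    proximal = trapezoid(distance_bp, 10, 20, 40, 60)
    return fuzzy_and(strong_site, proximal)


# Two made-up promoters sharing the same regulator but differing in features.
degree_a = promoter_degree(site_score=0.9, distance_bp=35)
degree_b = promoter_degree(site_score=0.6, distance_bp=50)
print(degree_a, degree_b)
```

Because each feature contributes a graded degree rather than a yes/no match to a consensus, two promoters bound by the same regulator can receive different scores, which is the kind of distinction a purely consensus-based description would erase.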
