Mapping functional transcription factor networks from gene expression data

A critical step in understanding how a genome functions is determining which transcription factors (TFs) regulate each gene. Accordingly, extensive effort has been devoted to mapping TF networks. In Saccharomyces cerevisiae, protein-DNA interactions have been identified for most TFs by ChIP-chip, and expression profiles have been obtained for deletion strains of most TFs. These studies revealed little overlap between the genes whose promoters are bound by a TF and those whose expression changes when the TF is deleted. As a result, we have neither a definitive TF network for any eukaryote nor an efficient method for mapping functional TF networks. This paper describes NetProphet, a novel algorithm that improves the efficiency of network mapping from gene expression data. NetProphet exploits a fundamental observation about the nature of TF networks: the response to disrupting or overexpressing a TF is strongest on its direct targets and dissipates rapidly as it propagates through the network. Using S. cerevisiae data, we show that NetProphet can predict thousands of direct, functional regulatory interactions from gene expression data alone. The targets that NetProphet predicts for a TF are at least as likely to have sites matching the TF's binding specificity as the targets implicated by ChIP. Unlike most ChIP targets, the NetProphet targets also show evidence of functional regulation. This suggests a surprising conclusion: the best way to begin mapping direct, functional TF-promoter interactions may not be by measuring binding. We also show that NetProphet yields new insights into the functions of several yeast TFs, including a well-studied TF, Cbf1, and a completely unstudied TF, Eds1.
