Computational inference of transcriptional regulatory networks from expression profiling and transcription factor binding site identification.

We have developed a computational method for transcriptional regulatory network inference, CARRIE (Computational Ascertainment of Regulatory Relationships Inferred from Expression), which combines microarray and promoter sequence analysis. CARRIE uses both sources of data to identify the transcription factors (TFs) that regulate gene expression changes in response to a stimulus, and it generates testable hypotheses about the regulatory network connecting these TFs to the genes they regulate. The promoter analysis component of CARRIE, ROVER (Relative OVER-abundance of cis-elements), is highly accurate at detecting the TFs that regulate the response to a stimulus. ROVER also predicts which genes are regulated by each of these TFs. CARRIE uses these transcriptional interactions to infer a regulatory network. To demonstrate our method, we applied CARRIE to six sets of publicly available DNA microarray experiments on Saccharomyces cerevisiae. The predicted networks were validated by comparison with literature sources, experimental TF binding data, and Gene Ontology biological process information.
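The general idea of combining expression data with promoter analysis can be illustrated with a small sketch. This is not the CARRIE or ROVER implementation: it assumes a simple hypergeometric over-representation test as a stand-in for ROVER's statistic, and the gene names, TF names, binding-site assignments, and the `alpha` threshold are all hypothetical. A TF is kept if its binding sites occur more often in the promoters of stimulus-responsive genes than expected by chance, and the responsive genes carrying its site become its predicted targets.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K carry the site."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def infer_edges(site_hits, responsive, all_genes, alpha=0.05):
    """Return {tf: sorted predicted targets} for over-represented TFs.

    site_hits  : dict mapping TF name -> set of genes with a predicted site
    responsive : set of genes whose expression changed (from microarrays)
    all_genes  : set of all genes considered (the background)
    alpha      : significance cutoff (an assumed value, not from the paper)
    """
    N = len(all_genes)
    n = len(responsive)
    network = {}
    for tf, genes_with_site in site_hits.items():
        K = len(genes_with_site & all_genes)
        targets = genes_with_site & responsive
        k = len(targets)
        if k > 0 and hypergeom_sf(k, N, K, n) < alpha:
            network[tf] = sorted(targets)
    return network


# Hypothetical toy data: 20 genes, 4 of which respond to the stimulus.
all_genes = {f"g{i}" for i in range(1, 21)}
responsive = {"g1", "g2", "g3", "g4"}
site_hits = {
    # TF_A sites are concentrated in the responsive genes.
    "TF_A": {"g1", "g2", "g3", "g4", "g5"},
    # TF_B sites are spread mostly over non-responsive genes.
    "TF_B": {"g1"} | {f"g{i}" for i in range(10, 19)},
}

network = infer_edges(site_hits, responsive, all_genes)
print(network)
```

In this toy example only the hypothetical TF_A passes the cutoff, so the inferred network contains edges from TF_A to the four responsive genes. A real analysis would also need multiple-testing correction and a background model for binding-site predictions.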
