Adaptively inferring human transcriptional subnetworks

Although the human genome has been sequenced, progress in understanding gene regulation in humans has been particularly slow. Many computational approaches developed for lower eukaryotes to identify cis‐regulatory elements and their associated target genes do not generalize to mammals, largely because of the degenerate and interactive nature of such elements. Motivated by the switch‐like behavior of transcriptional responses, we present a systematic approach that adaptively determines active transcriptional subnetworks (cis‐motif combinations, their direct target genes, and the physiological processes regulated by the corresponding transcription factors) from mammalian microarray data, with accuracy similar to that achieved in lower eukaryotes. Our analysis uncovered several new subnetworks active in human liver and in cell‐cycle regulation, with functional characteristics similar to those of the known ones. We present biochemical evidence for our predictions and show that the recently discovered G2/M‐specific E2F pathway is wider than previously thought; in particular, E2F directly activates certain mitotic genes involved in hepatocellular carcinomas. Additionally, we demonstrate that this method can predict subnetworks in a condition‐specific manner, as well as regulatory crosstalk across multiple tissues. Our approach allows a systematic understanding of how phenotypic complexity is regulated at the transcriptional level in mammals and offers a marked advantage in systems where little or no prior knowledge of transcriptional regulation is available.
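The switch‐like response mentioned above can be illustrated with a single hinge function, max(0, score − knot), of the kind used in adaptive regression splines. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names `hinge` and `fit_hinge`, the toy motif scores, and the toy expression values are all hypothetical. It scans candidate knots and keeps the one whose least-squares fit has the smallest residual sum of squares.

```python
def hinge(x, knot):
    """Switch-like basis function: zero below the knot, linear above it."""
    return max(0.0, x - knot)


def fit_hinge(scores, expression):
    """Fit expression ~ intercept + slope * hinge(score, knot).

    Every observed score is tried as a candidate knot; the knot with the
    smallest residual sum of squares (RSS) is returned with its coefficients.
    Hypothetical helper for illustration only.
    """
    n = len(scores)
    mean_y = sum(expression) / n
    best = None
    for knot in sorted(set(scores)):
        h = [hinge(x, knot) for x in scores]
        mean_h = sum(h) / n
        var_h = sum((v - mean_h) ** 2 for v in h)
        if var_h == 0.0:
            continue  # hinge is constant for this knot, so no slope can be fit
        cov = sum((v - mean_h) * (y - mean_y) for v, y in zip(h, expression))
        slope = cov / var_h
        intercept = mean_y - slope * mean_h
        rss = sum((y - (intercept + slope * v)) ** 2
                  for v, y in zip(h, expression))
        if best is None or rss < best[0]:
            best = (rss, knot, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct scores")
    rss, knot, intercept, slope = best
    return {"knot": knot, "intercept": intercept, "slope": slope, "rss": rss}


# Toy data (assumed): expression stays flat until the motif score passes 4,
# then rises linearly, mimicking a transcriptional switch.
toy_scores = [0, 1, 2, 3, 4, 5, 6, 7, 8]
toy_expression = [1, 1, 1, 1, 1, 3, 5, 7, 9]
model = fit_hinge(toy_scores, toy_expression)
print(model["knot"], round(model["slope"], 6), round(model["intercept"], 6))
```

Choosing the knot from the data, rather than fixing a threshold in advance, is what makes such a fit adaptive; combinations of motifs could be modeled by products of hinge terms, which this single-variable sketch does not attempt.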