LLM3D: a log-linear modeling-based method to predict functional gene regulatory interactions from genome-wide expression data

All cellular processes are regulated by condition-specific and time-dependent interactions between transcription factors and their target genes. In simple organisms such as bacteria and yeast, a large amount of experimental data is available to support functional transcription regulatory interactions. In mammalian systems, by contrast, reconstruction of gene regulatory networks still depends heavily on the accurate prediction of transcription factor binding sites. Here, we present a new method, log-linear modeling of 3D contingency tables (LLM3D), to predict functional transcription factor binding sites. LLM3D combines gene expression data, gene ontology annotation and computationally predicted transcription factor binding sites in a single statistical analysis, and offers a methodological improvement over existing enrichment-based methods. We show that LLM3D identifies novel transcriptional regulators of the yeast metabolic cycle, and predicts key regulators of mouse embryonic stem cell self-renewal more accurately than existing enrichment-based methods. Moreover, in a clinically relevant in vivo injury model of mammalian neurons, LLM3D identified peroxisome proliferator-activated receptor γ (PPARγ) as a neuron-intrinsic transcriptional regulator of regenerative axon growth. In conclusion, LLM3D provides a significant improvement over existing methods in predicting functional transcription regulatory interactions in the absence of experimental transcription factor binding data.
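The idea of testing a three-way association in a 3D contingency table can be illustrated with a minimal sketch. This is not the LLM3D implementation. It assumes a 2x2x2 table of gene counts whose axes are membership in an expression cluster, annotation with a gene ontology term, and presence of a predicted binding site. It fits the standard log-linear model containing all two-way interactions by iterative proportional fitting, and reports the likelihood-ratio statistic G², which measures the evidence for the three-way interaction. The counts and the function names `fit_no_three_way` and `g_squared` are hypothetical.

```python
import numpy as np


def fit_no_three_way(table, n_iter=200, tol=1e-10):
    """Fit the log-linear model with all two-way interactions (no three-way
    term) to a 3D table of counts by iterative proportional fitting.
    Assumes every two-way margin of the table is positive."""
    observed = np.asarray(table, dtype=float)
    fitted = np.ones_like(observed)
    for _ in range(n_iter):
        previous = fitted.copy()
        for axis in range(3):
            # Rescale so the two-way margin obtained by summing over `axis`
            # equals the corresponding observed margin.
            obs_margin = observed.sum(axis=axis, keepdims=True)
            fit_margin = fitted.sum(axis=axis, keepdims=True)
            fitted = fitted * obs_margin / fit_margin
        if np.max(np.abs(fitted - previous)) < tol:
            break
    return fitted


def g_squared(observed, fitted):
    """Likelihood-ratio statistic G^2 = 2 * sum(obs * log(obs / fitted))."""
    observed = np.asarray(observed, dtype=float)
    mask = observed > 0  # cells with zero count contribute nothing
    return 2.0 * np.sum(observed[mask] * np.log(observed[mask] / fitted[mask]))


# Hypothetical gene counts, indexed as
# [in expression cluster][has GO term][has predicted binding site].
counts = np.array([[[900, 60], [80, 10]],
                   [[50, 8], [12, 20]]])

fitted = fit_no_three_way(counts)
g2 = g_squared(counts, fitted)
print(f"G^2 for the three-way interaction: {g2:.3f}")
```

For an I x J x K table, G² is compared with a chi-square distribution with (I-1)(J-1)(K-1) degrees of freedom, which is 1 for the 2x2x2 case above. A large value suggests that the association between binding site and function differs inside and outside the expression cluster.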
