Inferring Gene Networks: Dream or Nightmare?

Inferring gene networks is a daunting task. Here we describe several algorithms we used in the Dialogue for Reverse Engineering Assessments and Methods (DREAM2) Reverse Engineering Competition 2007: an algorithm based on first‐order partial correlation for discovering BCL6 targets in Challenge 1, and an algorithm using nonlinear optimization that achieved the winning performance in Challenge 3. After the gold standards for the challenges were released, we could also evaluate alternative variants of these algorithms. The DREAM competition taught us two strong lessons. First, and surprisingly, simpler methods generally performed better than more advanced, theoretically motivated approaches. Second, the challenges showed that inferring gene networks requires controlled experimentation with a well‐defined experimental design: all algorithms performed weakly on data obtained by merging many unrelated datasets, while algorithms that explicitly took the experimental design into account performed best.
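As a rough illustration of the quantity behind the Challenge 1 approach, the sketch below computes a first‐order partial correlation, that is, the correlation between two expression profiles after removing the linear effect of a single third gene. It uses the standard textbook formula and is not the authors' implementation; the function name `first_order_partial_corr` and the simulated profiles are hypothetical and chosen only for this example.

```python
import numpy as np


def first_order_partial_corr(x, y, z):
    """Correlation of x and y after conditioning on one variable z.

    Standard formula:
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r = np.corrcoef([x, y, z])  # 3x3 Pearson correlation matrix
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


# Simulated data (an assumption for illustration): z drives both x and y,
# so x and y are correlated only indirectly through z.
rng = np.random.default_rng(0)
z = rng.normal(size=2000)
x = 2.0 * z + rng.normal(size=2000)
y = 2.0 * z + rng.normal(size=2000)

plain_corr = np.corrcoef(x, y)[0, 1]
partial_corr = first_order_partial_corr(x, y, z)
```

In this simulated setting the plain correlation between `x` and `y` is high, while the partial correlation given `z` is close to zero, which is why conditioning on a third gene can help separate direct from indirect associations.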
