Lessons from the DREAM 2 Challenges: A Community Effort to Assess Biological Network Inference

However creative, innovative, and elegant our computational methods may be, the ultimate proof of an algorithm’s worth is the experimentally validated quality of its predictions. Unfortunately, this truism is hard to put into practice. Modelers usually produce hundreds to hundreds of thousands of predictions, most (if not all) of which go untested. In a best-case scenario, a small subsample of predictions (usually three to ten) is experimentally validated as a quality-control step to attest to the soundness of the full set of predictions. However, whether this small sample is even representative of the algorithm’s overall performance is a question usually left unaddressed. Thus, a clear understanding of an algorithm’s strengths and weaknesses most often remains elusive, especially to the experimental biologists who must decide which tool to use to address a specific problem. In this chapter, we describe the first systematic set of challenges posed to the systems biology community in the framework of the DREAM (Dialogue for Reverse Engineering Assessments and Methods) project. These tests, which came to be known as the DREAM2 challenges, consist of data generously donated to the DREAM project by participants and curated into network reconstruction problems whose solutions, the actual networks behind the data, were withheld from the participants. We explain the resulting five challenges, compare the submissions globally, and discuss the best-performing strategies.
