Open Community Challenge Reveals Molecular Network Modules with Key Roles in Diseases

Identification of modules in molecular networks is at the core of many current analysis methods in biomedical research. However, how well different approaches identify disease-relevant modules in different types of networks remains poorly understood. We launched the "Disease Module Identification DREAM Challenge", an open competition to comprehensively assess module identification methods across diverse gene, protein and signaling networks. Predicted network modules were tested for association with complex traits and diseases using a unique collection of 180 genome-wide association studies (GWAS). Although several approaches successfully discovered complementary trait-associated modules, consensus predictions derived from the challenge submissions performed best. We find that most of these modules correspond to core disease-relevant pathways, which often comprise therapeutic targets and correctly prioritize candidate disease genes. This community challenge establishes benchmarks, tools and guidelines for molecular network analysis to study human disease biology (https://synapse.org/modulechallenge).
