ModuleDiscoverer: Identification of regulatory modules in protein-protein interaction networks

The identification of disease-associated modules based on protein-protein interaction networks (PPINs) and gene expression data has provided new insights into the mechanistic nature of diverse diseases. However, identifying such modules requires detecting protein communities within large-scale, whole-genome PPINs, which remains difficult. One successful strategy detects a PPIN’s community structure by solving the maximal clique enumeration (MCE) problem, which is non-deterministic polynomial time-hard (NP-hard). This makes the approach computationally challenging for large PPINs and calls for new strategies. We present ModuleDiscoverer, a novel approach for the identification of regulatory modules from PPINs and gene expression data. Following the MCE-based approach, ModuleDiscoverer approximates the community structure using a randomization heuristic. Given a PPIN of Rattus norvegicus and public gene expression data, we identify the regulatory module underlying a rodent model of non-alcoholic steatohepatitis (NASH), a severe form of non-alcoholic fatty liver disease (NAFLD). The module is validated using single-nucleotide polymorphism (SNP) data from independent genome-wide association studies and gene enrichment tests. Based on gene enrichment tests, we find that ModuleDiscoverer performs comparably to three existing module-detecting algorithms. However, only our NASH-module is significantly enriched with genes linked to NAFLD-associated SNPs. ModuleDiscoverer is available at http://www.hki-jena.de/index.php/0/2/490 (Others/ModuleDiscoverer).
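
To make the idea of a randomized approximation of maximal clique enumeration concrete, the following minimal Python sketch grows maximal cliques from randomly chosen seed proteins on a toy network. It is an illustration only and is not the ModuleDiscoverer algorithm: the function names (`build_adjacency`, `random_maximal_clique`, `approximate_cliques`), the iteration count, and the toy edge list are all assumptions introduced here.

```python
import random


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_maximal_clique(adj, seed_node, rng):
    """Grow one maximal clique from seed_node by adding random compatible nodes."""
    clique = {seed_node}
    # Candidates are nodes adjacent to every current clique member.
    candidates = set(adj[seed_node])
    while candidates:
        node = rng.choice(sorted(candidates))
        clique.add(node)
        # Keep only candidates that are also adjacent to the new member.
        candidates &= adj[node]
    return frozenset(clique)


def approximate_cliques(adj, iterations=200, seed=0):
    """Collect the distinct maximal cliques reached by repeated random growth."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    found = set()
    for _ in range(iterations):
        found.add(random_maximal_clique(adj, rng.choice(nodes), rng))
    return found


# Hypothetical toy network: two triangles joined by the edge C-D.
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("C", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
adj = build_adjacency(edges)
cliques = approximate_cliques(adj)
for clique in sorted(cliques, key=sorted):
    print(sorted(clique))
```

Each run of the growth loop ends only when no remaining node is adjacent to all clique members, so every returned set is a maximal clique. Repeated random restarts sample the set of maximal cliques rather than enumerating it exhaustively, which is the general trade-off a randomization heuristic makes on large networks.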
