MultiPaths: a Python framework for analyzing multi-layer biological networks using diffusion algorithms

Summary: High-throughput screening yields vast amounts of biological data, which can be highly challenging to interpret. In response, knowledge-driven approaches have emerged as possible solutions for analyzing large datasets by leveraging prior knowledge of biomolecular interactions represented as biological networks. Nonetheless, given the size and complexity of these networks, manual investigation quickly becomes impractical. Thus, computational approaches, such as diffusion algorithms, are often employed to interpret and contextualize the results of high-throughput experiments. Here, we present MultiPaths, a framework consisting of two independent Python packages for network analysis. The first package, DiffuPy, comprises numerous state-of-the-art diffusion algorithms applicable to any generic network, while the second, DiffuPath, enables the application of these algorithms to multi-layer biological networks. To facilitate usability, the framework includes a command line interface, reproducible examples, and documentation. To demonstrate the framework, we conducted several diffusion experiments on three independent multi-omics datasets over disparate networks generated from pathway databases, thus highlighting the ability of multi-layer networks to integrate multiple modalities. Finally, the results of these experiments demonstrate how generating harmonized networks from disparate databases can improve predictive performance relative to individual resources.

Availability: DiffuPy and DiffuPath are publicly available under the Apache License 2.0 at https://github.com/multipaths.

Contact: sergi.picart@upc.edu and daniel.domingo.fernandez@scai.fraunhofer.de
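To make the idea of diffusing scores over a multi-layer network concrete, the following is a minimal sketch using a random walk with restart, written in plain Python. It does not use or reproduce the DiffuPy or DiffuPath API; the function name `diffuse`, the `restart` parameter, and the toy node labels (`gene:`, `metabolite:`, `mirna:` prefixes) are assumptions introduced only for illustration.

```python
def diffuse(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative only).

    adjacency: dict mapping each node to the set of its neighbours.
    seeds: dict mapping input nodes to non-negative scores (e.g. measured hits).
    Returns a dict mapping every node to its diffused score (scores sum to 1).
    """
    nodes = sorted(adjacency)
    total = sum(seeds.values())
    # Normalised input vector: where the walker restarts.
    p0 = {n: seeds.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart step: a fraction of the mass returns to the seed nodes.
        nxt = {n: restart * p0[n] for n in nodes}
        # Propagation step: the rest is split evenly among neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: its mass has nowhere to go, so it stays put.
                nxt[u] += (1 - restart) * p[u]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network mixing two connected layers (genes, metabolites)
# plus one disconnected miRNA node.
adjacency = {
    "gene:A": {"gene:B", "metabolite:X"},
    "gene:B": {"gene:A", "gene:C"},
    "gene:C": {"gene:B"},
    "metabolite:X": {"gene:A"},
    "mirna:M": set(),
}
scores = diffuse(adjacency, {"gene:A": 1.0})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, the score placed on `gene:A` spreads across layer boundaries to `metabolite:X`, while the disconnected `mirna:M` receives nothing, which illustrates why connecting layers in a harmonized network matters for propagating signal between modalities.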
