KNeMAP: a network mapping approach for knowledge-driven comparison of transcriptomic profiles

Abstract

Motivation: Transcriptomic data can be used to describe the mechanism of action (MOA) of a chemical compound. However, omics data tend to be complex and prone to noise, making the comparison of different datasets challenging. Often, transcriptomic profiles are compared at the level of individual gene expression values or sets of differentially expressed genes. Such approaches can suffer from underlying technical and biological variance, such as the biological system exposed, the machine or method used to measure gene expression, and technical errors. They also neglect the relationships between genes. We propose a network mapping approach for knowledge-driven comparison of transcriptomic profiles (KNeMAP), which combines genes into similarity groups based on multiple levels of prior information, hence adding a higher-level view to the individual gene view. Compared with fold change (expression) based and deregulated gene set-based methods, KNeMAP grouped compounds with higher accuracy with respect to prior information and was less affected by noise-corrupted data.

Results: We applied KNeMAP to the Connectivity Map dataset, in which the gene expression changes of three cell lines were analyzed after treatment with 676 drugs, and to the Fortino et al. dataset, in which two cell lines exposed to 31 nanomaterials were analyzed. Although the expression profiles differ strongly across the biological systems, KNeMAP was able to identify sets of compounds that induce similar molecular responses when applied to the same biological system.

Availability and implementation: Relevant data and the KNeMAP function are available at https://github.com/fhaive/KNeMAP and on Zenodo (10.5281/zenodo.7334711).
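The contrast between a gene-level and a group-level comparison can be illustrated with a minimal sketch. This is not the KNeMAP implementation: the gene groups, gene names, fold-change values, the mean aggregation and the Euclidean distance are all assumptions chosen for illustration only.

```python
import math

def group_scores(profile, gene_groups):
    """Collapse a gene-level profile to one mean fold change per gene group.

    profile:     dict mapping gene -> fold change
    gene_groups: dict mapping group name -> list of genes
                 (hypothetical groups standing in for prior knowledge)
    """
    scores = {}
    for name, genes in gene_groups.items():
        values = [profile[g] for g in genes if g in profile]
        scores[name] = sum(values) / len(values) if values else 0.0
    return scores

def euclidean(a, b):
    """Euclidean distance between two dict-based profiles (missing keys count as 0)."""
    keys = sorted(set(a) | set(b))
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

# Toy data (invented): compounds A and B deregulate different genes
# of the same group, compound C responds in the opposite direction.
gene_groups = {"group_1": ["G1", "G2"], "group_2": ["G3", "G4"]}
profiles = {
    "compound_A": {"G1": 2.0, "G2": 0.0, "G3": -1.0, "G4": -1.0},
    "compound_B": {"G1": 0.0, "G2": 2.0, "G3": -1.0, "G4": -1.0},
    "compound_C": {"G1": -2.0, "G2": -2.0, "G3": 1.0, "G4": 1.0},
}

grouped = {name: group_scores(p, gene_groups) for name, p in profiles.items()}

gene_level_ab = euclidean(profiles["compound_A"], profiles["compound_B"])
group_level_ab = euclidean(grouped["compound_A"], grouped["compound_B"])
group_level_ac = euclidean(grouped["compound_A"], grouped["compound_C"])

print(gene_level_ab, group_level_ab, group_level_ac)
```

In this toy case, compounds A and B look different gene by gene but identical once genes are aggregated into groups, while compound C remains distant at both levels. The actual method derives its groups from multiple levels of prior information rather than from a hand-written dictionary.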
