Correlation AnalyzeR: functional predictions from gene co-expression correlations

Background: Co-expression correlations provide the ability to predict gene functionality within specific biological contexts, such as different tissue and disease conditions. However, current gene co-expression databases generally do not consider biological context. In addition, these tools often implement a limited range of unsophisticated analysis approaches, diminishing their utility for exploring gene functionality and gene relationships. Furthermore, they typically do not provide the summary visualizations necessary to communicate these results, posing a significant barrier to their use by biologists without computational skills.

Results: We present Correlation AnalyzeR, a user-friendly web interface for exploring co-expression correlations and predicting gene functions, gene–gene relationships, and gene set topology. Correlation AnalyzeR provides flexible access to its database of tissue- and disease-specific (cancer vs. normal) genome-wide co-expression correlations, and it also implements a suite of sophisticated computational tools for generating functional predictions with user-friendly visualizations. In the usage example provided here, we explore the role of BRCA1-NRF2 interplay in the context of bone cancer, demonstrating how Correlation AnalyzeR can be used effectively to generate and support novel hypotheses.

Conclusions: Correlation AnalyzeR facilitates the exploration of poorly characterized genes and gene relationships to reveal novel biological insights. The database and all analysis methods can be accessed as a web application at https://gccri.bishop-lab.uthscsa.edu/correlation-analyzer/ and as a standalone R package at https://github.com/Bishop-Laboratory/correlationAnalyzeR.
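To make the idea of a genome-wide co-expression correlation concrete, the following is a minimal sketch in Python, not the Correlation AnalyzeR implementation. It assumes Pearson correlation computed on a small expression matrix (genes in rows, samples in columns). The function name `coexpression_with`, the gene labels, and the toy values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def coexpression_with(gene, expr, genes):
    """Rank all other genes by Pearson correlation with `gene`.

    expr  : 2-D array, one row per gene, one column per sample
    genes : list of gene labels, in the same order as the rows of expr
    Assumes every gene varies across samples (no zero-variance rows).
    """
    idx = genes.index(gene)
    # Center each gene across samples, then scale each row to unit length.
    centered = expr - expr.mean(axis=1, keepdims=True)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    # The dot product of two centered unit vectors is their Pearson correlation.
    r = unit @ unit[idx]
    order = np.argsort(-r)  # most positively correlated first
    return [(genes[i], float(r[i])) for i in order if i != idx]


# Hypothetical toy data: 3 genes measured in 4 samples.
genes = ["GENE_A", "GENE_B", "GENE_C"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],  # GENE_A
    [2.0, 4.0, 6.0, 8.0],  # GENE_B rises with GENE_A
    [4.0, 3.0, 2.0, 1.0],  # GENE_C falls as GENE_A rises
])

ranked = coexpression_with("GENE_A", expr, genes)
for name, r in ranked:
    print(f"{name}\t{r:+.3f}")
```

In a context-specific analysis, the same calculation would be repeated on the subset of samples belonging to one tissue or disease condition. The resulting ranked list is the kind of input that a preranked enrichment method could use to suggest functions for the query gene.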
