LMSM: A modular approach for identifying lncRNA related miRNA sponge modules in breast cancer

Existing methods for identifying lncRNA related miRNA sponge modules rely mainly on lncRNA related miRNA sponge interaction networks, which may not provide a full picture of miRNA sponging activities under biological conditions. Hence, there is a strong need for new computational methods to identify lncRNA related miRNA sponge modules. In this work, we propose a framework, LMSM, to identify LncRNA related MiRNA Sponge Modules from heterogeneous data. To understand the miRNA sponging activities under biological conditions, LMSM uses gene expression data to evaluate the influence of the shared miRNAs on the clustered sponge lncRNAs and mRNAs. We have applied LMSM to the human breast cancer (BRCA) dataset from The Cancer Genome Atlas (TCGA). We find that the majority of LMSM modules are implicated in BRCA and that most of them are BRCA subtype-specific. Most of the mediating miRNAs act as crosslinks across different LMSM modules. Moreover, the consistency of the results suggests that LMSM is robust in identifying lncRNA related miRNA sponge modules. Finally, LMSM can be used to predict miRNA targets. Altogether, our study shows that LMSM is a promising method for investigating the modular regulatory mechanisms of sponge lncRNAs from heterogeneous data.
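The idea of using expression data to gauge how much shared miRNAs influence a cluster of sponge lncRNAs and mRNAs can be illustrated with a small sketch. This is not LMSM's actual statistic, which the abstract does not specify; it is a simplified stand-in that assumes each cluster can be summarized by the mean of its standardized expression profiles and that "influence" can be approximated by the drop from the ordinary correlation to the partial correlation after regressing out the miRNAs. All function names (`module_summary`, `residualize`, `mirna_influence`) and the simulated data are hypothetical.

```python
import numpy as np


def module_summary(expr):
    """Summarize a samples x genes matrix as the mean z-scored profile.

    Assumes every gene (column) has non-zero variance across samples.
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    return z.mean(axis=1)


def residualize(y, covariates):
    """Return residuals of y after least-squares regression on the
    covariates plus an intercept."""
    design = np.column_stack([np.ones(len(y)), covariates])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def mirna_influence(lnc_expr, mrna_expr, mirna_expr):
    """Compare lncRNA-mRNA correlation with and without the shared miRNAs.

    Returns (correlation, partial correlation given the miRNAs, difference).
    A large positive difference suggests the miRNAs account for much of
    the lncRNA-mRNA co-expression (illustrative criterion only).
    """
    lnc = module_summary(lnc_expr)
    mrna = module_summary(mrna_expr)
    corr = np.corrcoef(lnc, mrna)[0, 1]
    partial = np.corrcoef(
        residualize(lnc, mirna_expr), residualize(mrna, mirna_expr)
    )[0, 1]
    return corr, partial, corr - partial


if __name__ == "__main__":
    # Simulated data: two miRNAs repress three lncRNAs and four mRNAs.
    rng = np.random.default_rng(0)
    n_samples = 200
    mirna = rng.normal(size=(n_samples, 2))
    lnc = -mirna @ np.ones((2, 3)) + 0.3 * rng.normal(size=(n_samples, 3))
    mrna = -mirna @ np.ones((2, 4)) + 0.3 * rng.normal(size=(n_samples, 4))
    corr, partial, diff = mirna_influence(lnc, mrna, mirna)
    print(f"correlation={corr:.2f} partial={partial:.2f} difference={diff:.2f}")
```

In the simulated case, the lncRNA and mRNA clusters are co-expressed only because both respond to the same miRNAs, so the correlation should largely vanish once the miRNA expression is regressed out.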
