Systemically identifying and prioritizing risk lncRNAs through integration of pan-cancer phenotype associations

Although lncRNAs have emerged as a major class of regulatory molecules involved in normal cellular physiology and disease, our knowledge of them remains very limited, and discovering novel disease-related lncRNAs in cancers has become a major research challenge. Based on the assumption that diseases with similar phenotype associations share similar molecular mechanisms, we present a pan-cancer network-based prioritization approach that systematically identifies disease-specific risk lncRNAs by integrating disease phenotype associations. We applied this strategy to approximately 2800 tumor samples from 14 cancer types to prioritize disease risk lncRNAs. Our approach yielded an average area under the ROC curve (AUC) of 80.66%, with the highest AUC (98.14%) for medulloblastoma. When evaluated using leave-one-out cross-validation (LOOCV) for prioritization of disease candidate genes, it achieved an average AUC of 97.16%. Moreover, we demonstrated the robustness of the approach and the importance of each integrated component, including disease phenotype associations, known disease genes and the number of cancer types. Taking glioblastoma multiforme as a case study, we identified the candidate lncRNA gene SNHG1 as a novel disease risk factor for diagnosis and prognosis. In summary, we provide a novel lncRNA prioritization approach that integrates pan-cancer phenotype associations and could help researchers better understand the important roles of lncRNAs in human cancers.
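
The LOOCV evaluation mentioned above can be illustrated with a small sketch. The abstract does not specify the scoring function, so the code below assumes a random walk with restart (RWR) as a stand-in for network-based prioritization; the toy network, the gene labels and the function names `rwr_scores` and `loocv_auc` are all hypothetical and are not taken from the paper. Each known gene is held out in turn, the remaining known genes serve as seeds, and the AUC is the fraction of non-disease candidates that the held-out gene outranks.

```python
# Toy sketch only: random walk with restart (RWR) is an ASSUMED stand-in for
# the paper's network-based scoring, which the abstract does not specify.

def rwr_scores(adj, seeds, restart=0.5, iters=100):
    """Score every node by a random walk with restart from the seed nodes."""
    nodes = list(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iters):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node passes its non-restart mass equally to its neighbours.
            share = (1.0 - restart) * p[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        p = nxt
    return p


def loocv_auc(adj, known):
    """Hold out each known gene in turn and average its rank-based AUC."""
    negatives = [n for n in adj if n not in known]
    aucs = []
    for held in known:
        scores = rwr_scores(adj, known - {held})
        # AUC for one positive: share of negatives it outranks (ties count half).
        wins = sum(
            1.0 if scores[held] > scores[n]
            else 0.5 if scores[held] == scores[n]
            else 0.0
            for n in negatives
        )
        aucs.append(wins / len(negatives))
    return sum(aucs) / len(aucs)


# Hypothetical toy network: a, b, c form a dense module; d, e, f hang off it.
TOY_NETWORK = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
KNOWN_GENES = {"a", "b", "c"}  # hypothetical "known disease genes"

average_auc = loocv_auc(TOY_NETWORK, KNOWN_GENES)
print(f"Average LOOCV AUC on the toy network: {average_auc:.2%}")
```

On real data the network, the seed genes and the candidate set would come from the expression profiles and phenotype associations described above; the toy values here only show how a per-gene AUC is averaged across held-out genes.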
