BiRWLGO: A global network-based strategy for lncRNA function annotation using bi-random walk

A large number of long non-coding RNAs (lncRNAs) have been identified over the past decades. Accumulating evidence indicates that lncRNAs play key roles in various biological processes. However, the majority of lncRNAs have not been functionally characterized, and the annotation of lncRNA functions has become an area of focus in biology and bioinformatics. In this paper, we develop a global network-based strategy, BiRWLGO, to predict probable functions for lncRNAs on a large scale. In BiRWLGO, we first build a global network consisting of three component networks: an lncRNA-lncRNA similarity network, an lncRNA-protein interaction network, and a protein-protein interaction network. The bi-random walk algorithm is then applied to this network to explore similarities between lncRNAs and proteins. The functions of a query lncRNA are inferred from the Gene Ontology (GO) terms of its neighboring proteins. We compare the performance of BiRWLGO with other state-of-the-art approaches on a manually annotated lncRNA benchmark with known GO terms. BiRWLGO achieves the best predictive performance in terms of both maximum F-measure (Fmax) and coverage. Moreover, we demonstrate that integrating protein-protein interactions improves the prediction of lncRNA functions.
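The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule is a generic bi-random walk form (alternating walks on the two side networks with a restart toward the known associations), and the function names, the parameters `alpha`, `left_steps`, `right_steps`, and `top_k`, the symmetric normalization, and the toy matrices and `term_*` labels are all assumptions made for illustration.

```python
import numpy as np

def sym_normalize(W):
    """Symmetric normalization D^-1/2 W D^-1/2 (isolated nodes keep zero rows)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nz = d > 0
    d_inv_sqrt[nz] = 1.0 / np.sqrt(d[nz])
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def bi_random_walk(L, P, A, alpha=0.8, left_steps=3, right_steps=3):
    """Score lncRNA-protein pairs by walking on both side networks.

    L: lncRNA-lncRNA similarity matrix (n x n)
    P: protein-protein interaction matrix (m x m)
    A: known lncRNA-protein interactions (n x m)
    alpha, left_steps, right_steps are hypothetical defaults.
    """
    Ln, Pn = sym_normalize(L), sym_normalize(P)
    A = np.asarray(A, dtype=float)
    A0 = A / A.sum() if A.sum() > 0 else A
    R = A0.copy()
    for t in range(1, max(left_steps, right_steps) + 1):
        terms = []
        if t <= left_steps:   # walk on the lncRNA similarity network
            terms.append(alpha * Ln @ R + (1 - alpha) * A0)
        if t <= right_steps:  # walk on the protein interaction network
            terms.append(alpha * R @ Pn + (1 - alpha) * A0)
        R = sum(terms) / len(terms)
    return R

def predict_go_terms(R, protein_go, lnc_index, top_k=2):
    """Transfer GO terms from the top_k highest-scoring proteins of one lncRNA."""
    top = np.argsort(-R[lnc_index])[:top_k]
    scores = {}
    for p in top:
        for term in protein_go[p]:
            scores[term] = scores.get(term, 0.0) + R[lnc_index, p]
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (hypothetical): 3 lncRNAs, 4 proteins.
L = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]])           # lncRNA 2 resembles lncRNA 0
P = np.array([[0, 1, 0, 0], [1, 0, 0, 0],
              [0, 0, 0, 1], [0, 0, 1, 0]])                # two protein pairs
A = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]])  # lncRNA 2 has no known partners
protein_go = [["term_A"], ["term_B"], ["term_C"], ["term_C"]]  # placeholder labels

R = bi_random_walk(L, P, A)
print(predict_go_terms(R, protein_go, lnc_index=2, top_k=1))
```

In this toy setup, lncRNA 2 has no known protein partners, yet it receives nonzero scores through its similarity to lncRNA 0 and through the protein-protein edges. This is the kind of indirect evidence that a global network is meant to capture.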
