Predicting lncRNA–Protein Interactions With miRNAs as Mediators in a Heterogeneous Network Model

Long non-coding RNAs (lncRNAs) play important roles in various biological processes, many of which involve lncRNA–protein interactions. Identifying these interactions is therefore essential to understanding the molecular functions of lncRNAs. Because experiments to identify lncRNA–protein interactions are costly and time-consuming, computational methods have been developed as alternatives. However, existing lncRNA–protein interaction predictors usually require prior knowledge of experimentally supported lncRNA–protein interactions, so their performance is limited by the small number of known interactions. In this paper, we explore a novel way to predict lncRNA–protein interactions without direct prior knowledge: miRNAs are used as mediators to estimate potential interactions between lncRNAs and proteins. When validated against known lncRNA–protein interactions, our method achieved an AUROC (area under the receiver operating characteristic curve) of 0.821, which is comparable to state-of-the-art methods. Expanding the training dataset further improved the AUROC to 0.852. We believe that our method can be a useful supplement to existing methods, as it provides an alternative way to estimate lncRNA–protein interactions in a heterogeneous network without direct prior knowledge. All data and code for this work can be downloaded from GitHub (https://github.com/zyk2118216069/LncRNA-protein-interactions-prediction).
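
To make the idea of miRNA-mediated scoring concrete, the following is a minimal sketch and not the implementation from the repository above. It assumes two binary adjacency matrices (lncRNA–miRNA and miRNA–protein) and scores each lncRNA–protein pair by the cosine-normalized count of lncRNA → miRNA → protein paths; this normalization is an illustrative choice, since the abstract does not specify the relevance measure. The function names `mediated_scores` and `auroc` and the toy matrices are hypothetical. Known lncRNA–protein interactions are used only for validation, never for scoring.

```python
import numpy as np


def mediated_scores(lnc_mir, mir_prot):
    """Score lncRNA-protein pairs through shared miRNA neighbors.

    Assumption: the score is the number of lncRNA -> miRNA -> protein
    paths, normalized by the norms of the two neighbor profiles.
    """
    lnc_mir = np.asarray(lnc_mir, dtype=float)    # shape (n_lncRNA, n_miRNA)
    mir_prot = np.asarray(mir_prot, dtype=float)  # shape (n_miRNA, n_protein)
    paths = lnc_mir @ mir_prot                    # path counts per pair
    lnc_norm = np.linalg.norm(lnc_mir, axis=1, keepdims=True)
    prot_norm = np.linalg.norm(mir_prot, axis=0, keepdims=True)
    denom = lnc_norm * prot_norm
    # Pairs where either node has no miRNA neighbor get a score of 0.
    return np.divide(paths, denom, out=np.zeros_like(paths), where=denom > 0)


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative."""
    labels = np.asarray(labels).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative pair")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()  # ties count as half
    return float(wins / (pos.size * neg.size))


# Hypothetical toy network: 2 lncRNAs, 3 miRNAs, 2 proteins.
toy_lnc_mir = [[1, 1, 0],
               [0, 1, 1]]
toy_mir_prot = [[1, 0],
                [1, 0],
                [0, 1]]
# Known interactions, held out and used only to validate the scores.
toy_known = [[1, 0],
             [0, 1]]

toy_scores = mediated_scores(toy_lnc_mir, toy_mir_prot)
toy_auroc = auroc(toy_known, toy_scores)
print(toy_scores)
print(toy_auroc)
```

The pairwise AUROC above is quadratic in the number of pairs, which is fine for a toy example; on a full network a rank-based computation would be the more practical choice.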
