LPI-KTASLP: Prediction of LncRNA-Protein Interaction by Semi-Supervised Link Learning With Multivariate Information

Long non-coding RNAs (lncRNAs) are single-stranded, non-protein-coding transcripts of at least 200 nucleotides. LncRNAs play a crucial role in regulating gene expression at the transcriptional, post-transcriptional, and epigenetic levels, and they do so by interacting with the corresponding RNA-binding proteins. Computational identification of lncRNA–protein interactions (LPIs) has attracted considerable attention because it reduces laboratory cost and improves the speed and accuracy of identification. Although numerous in silico methods for LPI prediction have been proposed, there is still room to improve their accuracy. In this paper, we propose a novel method for identifying LPIs with kernel target alignment based on semi-supervised link prediction (LPI-KTASLP), which uses multivariate information to predict lncRNA–protein interactions. Kernel target alignment is applied to fuse the heterogeneous kernels. We compute low-rank approximation matrices for the lncRNAs and the proteins, using eigendecomposition to reduce the computational cost. The model then produces the final LPI prediction matrix. We evaluated the method on a standard benchmark dataset of LPIs, where it outperforms several existing LPI prediction methods. LPI-KTASLP obtains the highest AUPR, 0.6148, which is superior to the integrated LPLNP (AUPR: 0.4584), RWR (AUPR: 0.2827), CF (AUPR: 0.2357), LPIHN (AUPR: 0.2299), and LPBNI (AUPR: 0.3302). Encouragingly, most of the LPI predictions have been confirmed to be close to relevant concentrations.
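The kernel fusion and low-rank steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weighting rule (weights proportional to each kernel's alignment with the ideal kernel built from the interaction matrix) is a simple heuristic assumed here for illustration, and the function names `alignment`, `fuse_kernels`, and `low_rank` as well as the toy data are hypothetical.

```python
import numpy as np

def alignment(K1, K2):
    """Kernel alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    return float(np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2)))

def fuse_kernels(kernels, Y):
    """Weight each kernel by its alignment with the ideal kernel Y Y^T.

    Assumption: weights are normalized, non-negative alignment scores.
    This is a heuristic stand-in, not the paper's optimization.
    """
    target = Y @ Y.T
    scores = np.array([max(alignment(K, target), 0.0) for K in kernels])
    weights = scores / scores.sum()
    fused = sum(w * K for w, K in zip(weights, kernels))
    return fused, weights

def low_rank(K, k):
    """Rank-k approximation of a symmetric kernel from its k largest eigenpairs."""
    vals, vecs = np.linalg.eigh(K)
    top = np.argsort(vals)[::-1][:k]
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T

# Toy data (hypothetical): 6 lncRNAs x 4 proteins, two lncRNA feature views.
Y = np.array([[1, 0, 0, 1],
              [0, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
A = rng.random((6, 3))
B = rng.random((6, 5))
kernels = [A @ A.T, B @ B.T]  # linear kernels, one per view

fused, weights = fuse_kernels(kernels, Y)
approx = low_rank(fused, k=2)
```

Keeping only the leading eigenpairs is what makes the downstream link-prediction step cheaper: later matrix operations work with a rank-`k` factorization rather than the full kernel.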
