Prediction of Plant lncRNA-Protein Interactions Using Sequence Information Based on Deep Learning

Plant long non-coding RNAs (lncRNAs) play an important role in many biological processes, mainly through their interactions with RNA-binding proteins (RBPs). A basic step toward understanding the function of an lncRNA is to determine which proteins interact with it, and the RBPs that bind an lncRNA can be predicted by computational approaches. However, the main challenge is that interaction patterns or primitives are difficult to find. In this study, we propose PLRPI, a sequence-based method for predicting plant lncRNA-protein interactions. PLRPI uses k-mer frequency features for RNA and protein sequences, together with a stacked denoising autoencoder and a gradient boosting decision tree, to learn the hidden interactions between plant lncRNA and protein sequences. Experimental results show that PLRPI performs well on the test datasets ATH948 and ZEA22133, which are based on lncRNA-protein interactions in Arabidopsis thaliana and Zea mays: it achieves an accuracy of 90.4% on ATH948 and 82.6% on ZEA22133. PLRPI also outperforms other methods on some public RNA-protein interaction datasets. These results indicate that PLRPI generalizes well and is robust, making it an effective model for predicting plant lncRNA-protein interactions.
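
As an illustration of the k-mer frequency features mentioned in the abstract, the following is a minimal Python sketch of how a sequence might be turned into a fixed-length frequency vector. The function name `kmer_frequencies`, the choice of k = 2, the four-letter RNA alphabet, and the handling of unknown symbols are all assumptions made for this example; the abstract does not state the k values, the protein alphabet, or the normalization that PLRPI actually uses.

```python
from itertools import product


def kmer_frequencies(seq, k, alphabet):
    """Return the normalized k-mer frequency vector of `seq`.

    Hypothetical helper for illustration only. The vector has
    len(alphabet) ** k entries, ordered as itertools.product
    enumerates the k-mers over `alphabet`.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k across the sequence.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Assumption: windows containing symbols outside the alphabet are skipped.
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid window: return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Toy RNA sequence; k = 2 over {A, C, G, U} gives a 16-dimensional vector.
rna_vec = kmer_frequencies("ACGUACGU", 2, "ACGU")
```

In a pipeline of the kind the abstract describes, vectors like `rna_vec` for the lncRNA and an analogous vector for the protein would be the inputs to the representation-learning and classification stages; those stages are not sketched here.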
