Feature-Based and String-Based Models for Predicting RNA-Protein Interaction

In this work, we study two approaches to predicting RNA-protein interaction (RPI). In the first, feature-based approach, we combine features extracted from both sequences and secondary structures. Including this additional information about the RNA-protein pairs improved prediction accuracy. In the second approach, we apply search algorithms and data structures to extract effective string patterns for RPI prediction, using both sequence information (protein and RNA sequences) and structure information (protein and RNA secondary structures). This led to several string-based models for predicting interacting RNA-protein pairs. We present results that demonstrate the effectiveness of the proposed approaches, including comparisons against leading state-of-the-art methods.
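As an illustration of what combining sequence and secondary-structure features for an RNA-protein pair could look like, the following is a minimal sketch and not the feature set used in this work. It assumes normalized k-mer frequencies as the features, dot-bracket notation for RNA secondary structure, and a three-letter helix/strand/coil alphabet for protein secondary structure. The function names `kmer_frequencies` and `pair_features` are hypothetical.

```python
from itertools import product

# Assumed alphabets (illustrative only; the abstract does not specify encodings).
RNA_SEQ = "ACGU"
RNA_STRUCT = "()."                     # dot-bracket notation
PROT_SEQ = "ACDEFGHIKLMNPQRSTVWY"      # 20 standard amino acids
PROT_STRUCT = "HEC"                    # helix, strand, coil


def kmer_frequencies(seq, alphabet, k):
    """Return normalized k-mer counts over `alphabet`, in lexicographic product order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:              # skip windows containing unknown symbols
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


def pair_features(rna_seq, rna_struct, prot_seq, prot_struct, k=2):
    """Concatenate sequence and structure k-mer features for one RNA-protein pair."""
    return (
        kmer_frequencies(rna_seq, RNA_SEQ, k)
        + kmer_frequencies(rna_struct, RNA_STRUCT, k)
        + kmer_frequencies(prot_seq, PROT_SEQ, k)
        + kmer_frequencies(prot_struct, PROT_STRUCT, k)
    )


# Toy pair with made-up sequences and structures.
vec = pair_features("ACGUACGU", "((....))", "MKVLA", "CHHHC")
print(len(vec))
```

A vector of this kind could then be passed to any standard classifier. The choice of k, the alphabets, and the normalization are all assumptions made for the sketch.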
