Constructive Prediction of Potential RNA Aptamers for a Protein Target

Aptamers are short single-stranded nucleic acids that bind to target molecules with high affinity and selectivity. They are generally identified in vitro by SELEX (systematic evolution of ligands by exponential enrichment). To complement SELEX, several computational methods have been proposed for finding aptamers. However, many of these methods cannot be used to find new aptamers, either because they are classifiers that only determine whether a given RNA and protein interact, or because they are limited to a specific target. We therefore developed a new random forest (RF) model for finding potential RNA aptamers for a protein target. From an extensive analysis of protein-RNA complexes, including RNA aptamer-protein complexes, we identified key features of interacting RNA and protein molecules as well as structural constraints on RNA aptamers. The potential RNA aptamers predicted by our method have secondary and protein-binding structures similar to those of actual RNA aptamers. The RF model performed reliably in both cross-validation and independent testing. The key features and structural constraints identified in our study were effective in finding potential aptamers for a protein target. Although preliminary, our results are promising, and we believe this approach will help reduce the time and money spent on in vitro experiments by substantially limiting the size of the initial pool of nucleic acid sequences.
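
As an illustration of how an RNA-protein pair might be turned into a fixed-length input for a classifier such as a random forest, the following minimal sketch encodes a pair by sequence composition. The abstract does not specify the features used in the study, so the choice of mono- and dinucleotide frequencies and amino acid composition, along with the names `kmer_composition` and `encode_pair`, are assumptions made purely for illustration.

```python
from itertools import product

RNA_ALPHABET = "ACGU"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(seq, alphabet, k):
    """Return normalized k-mer frequencies of seq in a fixed, sorted-by-alphabet order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing non-standard symbols
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def encode_pair(rna, protein):
    """Concatenate hypothetical composition features for one RNA-protein pair.

    Assumed feature set (not taken from the paper):
    4 nucleotide + 16 dinucleotide + 20 amino acid frequencies.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as RNA
    protein = protein.upper()
    return (kmer_composition(rna, RNA_ALPHABET, 1)
            + kmer_composition(rna, RNA_ALPHABET, 2)
            + kmer_composition(protein, AMINO_ACIDS, 1))


# Toy sequences, made up for demonstration only.
features = encode_pair("GGGAUCC", "MKVLA")
print(len(features))
```

Each pair yields a vector of 40 values under these assumptions. Vectors of this kind, labeled as interacting or non-interacting, could then be passed to any standard random forest implementation.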
