Identifying 2'-O-methylation sites by integrating nucleotide chemical properties and nucleotide compositions.

2'-O-methylation is an important post-transcriptional modification that plays key roles in many biological processes. Although experimental techniques have been developed to detect 2'-O-methylation sites, they are costly. As a complement to experimental techniques, computational methods can facilitate the identification of 2'-O-methylation sites. In the present study, we propose a support vector machine-based method to identify 2'-O-methylation sites. In this method, RNA sequences are encoded using nucleotide chemical properties and nucleotide compositions. In the jackknife cross-validation test, the proposed method achieved an accuracy of 95.58% for identifying 2'-O-methylation sites in the human genome. Moreover, the model was also validated by identifying 2'-O-methylation sites in the Mus musculus and Saccharomyces cerevisiae genomes, where the accuracies obtained were also satisfactory. These results indicate that the proposed method could become a useful tool for research on 2'-O-methylation.
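
The abstract does not spell out how sequences are encoded, so the following is only a minimal sketch of one way such an encoding could look, not the method as implemented in the study. It assumes each nucleotide is described by three binary chemical properties (ring structure, functional group, hydrogen-bond strength) and that composition is captured as the running frequency of that nucleotide up to its position. The name `encode_rna`, the property table, and the example sequence are illustrative assumptions.

```python
# Assumed binary chemical properties per nucleotide:
# (ring structure: purine=1, functional group: amino=1, hydrogen bond: weak=1)
CHEMICAL_PROPERTIES = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_rna(sequence):
    """Encode an RNA sequence as a flat list with four values per nucleotide.

    For each position the vector holds the three assumed chemical-property
    bits followed by the running frequency of that nucleotide in the prefix
    ending at that position (a simple composition feature).
    """
    counts = {}
    features = []
    for position, nucleotide in enumerate(sequence.upper(), start=1):
        if nucleotide not in CHEMICAL_PROPERTIES:
            raise ValueError(f"unexpected nucleotide: {nucleotide!r}")
        counts[nucleotide] = counts.get(nucleotide, 0) + 1
        features.extend(CHEMICAL_PROPERTIES[nucleotide])
        features.append(counts[nucleotide] / position)
    return features


if __name__ == "__main__":
    # Hypothetical short sequence, used only to show the output layout.
    print(encode_rna("ACGA"))
```

Fixed-length vectors of this kind, one per candidate site with the site at the centre of an equal-length window, could then be passed to any support vector machine implementation for training and jackknife (leave-one-out) evaluation.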
