Predicting the Organelle Location of Noncoding RNAs Using Pseudo Nucleotide Compositions

Abstract Noncoding RNAs (ncRNAs) are implicated in various biological processes. Recent findings have demonstrated that the function of ncRNAs correlates with their provenance. Identifying ncRNAs from different organelle genomes will therefore help in understanding their molecular functions. However, the limitations of experimental techniques hinder progress in studying organellar ncRNAs and their functional relevance. As a complement to experiments, computational methods offer an important alternative for identifying ncRNAs in different organelles. Thus, a computational model was developed to identify ncRNAs from kinetoplast and mitochondrion organelle genomes. In this model, RNA sequences are encoded by “pseudo dinucleotide composition.” In the jackknife test, the proposed model achieved an overall success rate of 90.08%. We hope that the proposed method will be helpful in predicting ncRNA organellar locations.
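
To make the encoding step concrete, the following is a minimal Python sketch of a pseudo dinucleotide composition vector in its commonly described form: 16 normalized dinucleotide frequencies followed by a small number of sequence-order correlation terms. It is an illustration under stated assumptions, not the model's actual implementation. The function names, the defaults lam=2 and weight=0.5, and in particular the two "properties" (GC count and purine count per dinucleotide) are stand-ins chosen for this example; the abstract does not state which physicochemical properties or parameter values were used, and the usual standardization of property values is omitted here.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# The 16 RNA dinucleotides in a fixed order: AA, AC, AG, AU, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def toy_properties(dinuc):
    """Stand-in numeric properties of a dinucleotide (ASSUMPTION, not the
    paper's properties): number of G/C bases and number of purines (A/G)."""
    gc_count = sum(base in "GC" for base in dinuc)
    purine_count = sum(base in "AG" for base in dinuc)
    return (float(gc_count), float(purine_count))


def pse_dnc(seq, lam=2, weight=0.5):
    """Return a (16 + lam)-dimensional pseudo dinucleotide composition vector.

    lam    : number of correlation tiers (hypothetical default)
    weight : weight of the correlation terms (hypothetical default)
    """
    seq = seq.upper().replace("T", "U")
    if any(base not in NUCLEOTIDES for base in seq):
        raise ValueError("sequence may contain only A, C, G, U (or T)")

    # Overlapping dinucleotides along the sequence.
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if len(dinucs) <= lam:
        raise ValueError("sequence too short for the requested lam")

    # Normalized occurrence frequency of each of the 16 dinucleotides.
    freqs = [dinucs.count(d) / len(dinucs) for d in DINUCLEOTIDES]

    # Tier-j correlation: mean squared property difference between
    # dinucleotides that are j positions apart.
    thetas = []
    for j in range(1, lam + 1):
        diffs = []
        for a, b in zip(dinucs, dinucs[j:]):
            pa, pb = toy_properties(a), toy_properties(b)
            diffs.append(sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa))
        thetas.append(sum(diffs) / len(diffs))

    # Joint normalization so that the whole vector sums to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    vector = pse_dnc("ACGUACGUAC", lam=2, weight=0.5)
    print(len(vector))  # 16 frequency terms + 2 correlation terms
```

Each sequence, whatever its length, maps to a vector of the same dimension, which is what allows a standard classifier to be trained on it. The jackknife test mentioned above corresponds to leave-one-out cross-validation: each sequence is held out in turn, the model is trained on the rest, and the held-out sequence is predicted.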