Predicting microRNA biological functions based on genes discriminant analysis

Although thousands of microRNAs (miRNAs) have been identified experimentally in recent years, determining their specific biological functions through molecular biology experiments remains a challenge. Because members of the same family share the same or similar biological functions, classifying new miRNAs into their corresponding families helps guide further functional analysis. In this study, we first built a vector space from features of miRNA sequences and structures, organized according to miRBase family assignments. We then assigned each miRNA to its specific family using a novel genes discriminant analysis (GDA) approach. In each of the new families produced by GDA, the nucleotide sequences of all members showed a high degree of similarity. Under 10-fold cross-validation, GDA achieved accuracy rates of 68.68%, 80.74%, and 83.65% for original miRNA families with at least two, three, and four members, respectively. These encouraging results suggest that the proposed GDA can not only support the identification of families for new miRNAs but also contribute to predicting their biological functions.
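The general workflow described above (feature vectors, a discriminant rule, and k-fold cross-validation) can be illustrated with a minimal sketch. This is not the GDA method itself, whose details are not given here. As stand-ins, it assumes dinucleotide frequencies as the only features (no structural features), a nearest-centroid rule as the discriminant, and plain unstratified folds. All function names and the toy sequences are hypothetical and are not taken from miRBase.

```python
import random
from collections import defaultdict
from itertools import product

ALPHABET = "ACGU"

def kmer_features(seq, k=2):
    """Map an RNA sequence to a vector of normalized k-mer frequencies."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in counts:  # skip k-mers containing non-ACGU symbols
            counts[sub] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]

def centroids(vectors, labels):
    """Compute the mean feature vector of each family."""
    sums, sizes = {}, defaultdict(int)
    for v, y in zip(vectors, labels):
        prev = sums.get(y, [0.0] * len(v))
        sums[y] = [a + b for a, b in zip(prev, v)]
        sizes[y] += 1
    return {y: [a / sizes[y] for a in s] for y, s in sums.items()}

def predict(cents, v):
    """Assign v to the family whose centroid is closest (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, v))
    return min(cents, key=lambda y: dist2(cents[y]))

def cross_validate(seqs, labels, folds=10, seed=0):
    """Return overall accuracy under plain (unstratified) k-fold CV."""
    idx = list(range(len(seqs)))
    random.Random(seed).shuffle(idx)
    vecs = [kmer_features(s) for s in seqs]
    correct = 0
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        cents = centroids([vecs[i] for i in train],
                          [labels[i] for i in train])
        correct += sum(predict(cents, vecs[i]) == labels[i] for i in test)
    return correct / len(seqs)

# Hypothetical toy data: two artificial "families" with distinct composition.
toy_seqs = (["AUAU" * 4 + "A" * i for i in range(10)]
            + ["GCGC" * 4 + "G" * i for i in range(10)])
toy_labels = ["famA"] * 10 + ["famB"] * 10

accuracy = cross_validate(toy_seqs, toy_labels, folds=10)
print(f"10-fold CV accuracy on toy data: {accuracy:.2%}")
```

Restricting evaluation to families with at least two, three, or four members, as in the reported results, would amount to filtering the input by family size before calling `cross_validate`. With very small families, stratified folds would be a safer choice than the plain split used in this sketch.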
