Unification of miRNA and isomiR research: the mirGFF3 format and the mirtop API

Background: MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotides long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. Although many miRNA-seq analysis tools exist, there is no consensus on how miRNA/isomiR analyses should be performed, and the resulting diversity of output formats hinders accurate comparisons between tools and precludes both data sharing and the development of common downstream analysis methods.

Findings: To overcome this situation, we present a community-based project, miRTOP (miRNA Transcriptomic Open Project), which works towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream analysis tools that are compatible with any existing detection and quantification tool. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command-line Python tool, ‘mirtop’, to manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, miRge2.0, isomiR-SEA, sRNAbench, and Prost!, as well as BAM files. Its open architecture enables any tool or pipeline to output results in mirGFF3.

Conclusions: Collectively, a comprehensive isomiR categorization system, along with the accompanying mirGFF3 format and mirtop API, provides a complete solution for the standardization of miRNA and isomiR analysis. It enables data sharing, reporting, comparative analyses, and benchmarking, while promoting the development of common miRNA methods that focus on the steps downstream of miRNA detection, annotation, and quantification.
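Because mirGFF3 is described as building on GFF3, a downstream tool could in principle read its records with a generic GFF3-style parser. The following is a minimal sketch under stated assumptions, not the mirtop API: it splits one line into the nine tab-separated GFF3 columns and reads the last column as `key=value` pairs, as in standard GFF3. The function name `parse_gff3_line`, the attribute names (`Read`, `Name`, `Parent`, `Variant`, `Expression`), and all values in the example record are hypothetical, chosen only for illustration. The mirGFF3 specification itself is the authority on the real attribute names and syntax, which may differ from what is shown here.

```python
from urllib.parse import unquote

# The eight fixed GFF3 columns that precede the attributes column.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def parse_gff3_line(line):
    """Split one GFF3 feature line into its fixed columns plus an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields[:8]))
    record["start"] = int(record["start"])  # GFF3 coordinates are 1-based, inclusive
    record["end"] = int(record["end"])

    # Column 9: semicolon-separated key=value pairs, percent-encoded per GFF3.
    attributes = {}
    for pair in fields[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[unquote(key.strip())] = unquote(value.strip())
    record["attributes"] = attributes
    return record


# Hypothetical record: attribute names and values are assumptions for illustration,
# not taken from the mirGFF3 specification.
example = "\t".join([
    "hsa-let-7a-1", "example_tool", "isomiR", "4", "25", ".", "+", ".",
    "Read=TGAGGTAGTAGGTTGTATAGTT;Name=hsa-let-7a-5p;Parent=hsa-let-7a-1;"
    "Variant=NA;Expression=120,85",
])

record = parse_gff3_line(example)
# Assumed convention for this sketch: one comma-separated count per sample.
counts = [int(c) for c in record["attributes"]["Expression"].split(",")]
print(record["type"], record["attributes"]["Name"], counts)
```

Keeping each record as a plain dictionary means the same downstream code can consume output from any detection tool once that output has been converted to the shared format, which is the kind of interoperability the abstract argues for.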
