Jasmine: a Java pipeline for isomiR characterization in miRNA-Seq data

Abstract

Motivation: The existence of complex subpopulations of miRNA isoforms, or isomiRs, is well established. While many tools exist for investigating isomiR populations, they differ in how they characterize an isomiR, making it difficult to compare results across tools. Thus, there is a need for a more comprehensive and systematic standard for defining isomiRs. Such a standard would allow isomiR population structure to be investigated in progressively more refined subpopulations, permitting the identification of more subtle changes between conditions and leading to an improved understanding of the processes that generate these differences.

Results: We developed Jasmine, a software tool that incorporates a hierarchical framework for characterizing isomiR populations. Jasmine is a Java application that can process raw read data in FASTQ/FASTA format, or mapped reads in SAM format, to produce a detailed characterization of isomiR populations. Thus, Jasmine can reveal structure not apparent in a standard miRNA-Seq analysis pipeline.

Availability and implementation: Jasmine is implemented in Java and R and is freely available on Bitbucket at https://bitbucket.org/bipous/jasmine/src/master/.

Supplementary information: Supplementary data are available at Bioinformatics online.
