Fast bootstrapping-based estimation of confidence intervals of expression levels and differential expression from RNA-Seq data

Summary: This note presents IsoEM2 and IsoDE2, new versions of the IsoEM and IsoDE packages for expression level estimation and differential expression analysis, with enhanced features and faster runtimes. IsoEM2 estimates fragments per kilobase per million mapped fragments (FPKM) and transcripts per million (TPM) levels for genes and isoforms, and uses bootstrapping to attach confidence intervals to these estimates. IsoDE2 performs differential expression analysis using the bootstrap samples generated by IsoEM2. Both tools are available through a command line interface and, via wrappers for the Galaxy platform, through a graphical user interface (GUI).

Availability and implementation: The source code of this software suite is available at https://github.com/mandricigor/isoem2. The Galaxy wrappers are available at https://toolshed.g2.bx.psu.edu/view/saharlcc/isoem2_isode2/.

Contact: imandric1@student.gsu.edu or ion@engr.uconn.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
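To make the quantities named in the summary concrete, the following minimal Python sketch computes FPKM and TPM from per-transcript read counts and derives percentile bootstrap confidence intervals for TPM. It is an illustration under simplifying assumptions, not the IsoEM2 implementation: every read is assumed to be assigned unambiguously to one transcript (so no EM step is rerun per bootstrap sample), transcript lengths stand in for effective lengths, and the function names `fpkm`, `tpm` and `bootstrap_tpm_ci` are hypothetical.

```python
import random


def fpkm(counts, lengths):
    """FPKM per transcript: count * 1e9 / (length_in_bases * total_count)."""
    total = sum(counts)
    return [c * 1e9 / (l * total) for c, l in zip(counts, lengths)]


def tpm(counts, lengths):
    """TPM per transcript: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths)]
    scale = 1e6 / sum(rates)
    return [r * scale for r in rates]


def bootstrap_tpm_ci(counts, lengths, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap confidence intervals for TPM.

    Assumption (not from the paper): reads map unambiguously, so resampling
    reads with replacement reduces to redrawing per-transcript counts.
    """
    rng = random.Random(seed)
    n = len(counts)
    n_reads = sum(counts)
    per_transcript = [[] for _ in range(n)]
    for _ in range(n_boot):
        # Draw n_reads reads with replacement; each read lands on
        # transcript i with probability counts[i] / n_reads.
        resampled = [0] * n
        for i in rng.choices(range(n), weights=counts, k=n_reads):
            resampled[i] += 1
        for i, value in enumerate(tpm(resampled, lengths)):
            per_transcript[i].append(value)
    # Percentile interval: cut off (1 - level) / 2 of the samples on each side.
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = min(n_boot - 1, int((1 + level) / 2 * n_boot))
    intervals = []
    for values in per_transcript:
        values.sort()
        intervals.append((values[lo_idx], values[hi_idx]))
    return intervals


if __name__ == "__main__":
    counts = [100, 300, 50]      # toy read counts per transcript
    lengths = [1000, 3000, 500]  # toy transcript lengths in bases
    print(fpkm(counts, lengths))
    print(tpm(counts, lengths))
    print(bootstrap_tpm_ci(counts, lengths))
```

In a setting with multi-mapped reads, each bootstrap replicate would instead require re-estimating expression levels from the resampled reads, which is where runtime becomes the main concern.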
