BISR-RNAseq: an efficient and scalable RNAseq analysis workflow with interactive report generation

RNA sequencing (RNAseq) has become an increasingly affordable way to profile gene expression patterns. Here we introduce BISR-RNAseq, a workflow that combines several open-source software tools and can be run in a high-performance computing environment. The workflow was developed by the Bioinformatics Shared Resource Group (BISR) at the Ohio State University. It analyzes raw RNAseq data (alignment, QC, and generation of gene-wise counts) and integrates the quality analysis and differential expression results into a configurable R Shiny web application. To demonstrate the feasibility of the workflow, we applied it to a few publicly available RNAseq datasets downloaded from GEO. Source code is available for the workflow at https://code.bmi.osumc.edu/gadepalli.3/BISR-RNAseq-ICIBM2019 and for the Shiny application at https://code.bmi.osumc.edu/gadepalli.3/BISR_RNASeq_ICIBM19. An example dataset is demonstrated at https://dataportal.bmi.osumc.edu/RNA_Seq/.
