spliceR: an R package for classification of alternative splicing and prediction of coding potential from RNA-seq data

Background: RNA-seq data is currently underutilized, in part because it is difficult to predict the functional impact of alternative transcription events. Recent software improvements in full-length transcript deconvolution prompted us to develop spliceR, an R package for classification of alternative splicing and prediction of coding potential.

Results: spliceR uses the full-length transcript output from RNA-seq assemblers to detect single or multiple exon skipping, alternative donor and acceptor sites, intron retention, alternative first or last exon usage, and mutually exclusive exon events. For each of these events, spliceR also annotates the genomic coordinates of the differentially spliced elements, facilitating downstream sequence analysis. For each transcript, isoform fraction values are calculated to identify transcript switching between conditions. Lastly, spliceR predicts the coding potential of each transcript, as well as its potential sensitivity to nonsense-mediated decay (NMD).

Conclusions: spliceR is an easy-to-use tool that extends the usability of RNA-seq and assembly technologies by allowing greater depth of annotation of RNA-seq data. spliceR is implemented as an R package and is freely available from the Bioconductor repository (http://www.bioconductor.org/packages/2.13/bioc/html/spliceR.html).
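The isoform fraction idea mentioned in the Results can be illustrated with a minimal Python sketch. It is not spliceR code and does not use the spliceR API. It assumes that an isoform fraction is the expression of one transcript divided by the summed expression of all transcripts of the same gene, and that a switch is a change in that fraction between two conditions exceeding a chosen threshold. The function names, the `min_delta` threshold, and the expression values are all hypothetical.

```python
from collections import defaultdict


def isoform_fractions(expression):
    """Return isoform fractions for a {(gene_id, transcript_id): value} dict.

    Assumption: the gene-level value is the sum of its isoform values.
    Genes with zero total expression get a fraction of None.
    """
    gene_totals = defaultdict(float)
    for (gene, _tx), value in expression.items():
        gene_totals[gene] += value

    fractions = {}
    for (gene, tx), value in expression.items():
        total = gene_totals[gene]
        fractions[(gene, tx)] = value / total if total > 0 else None
    return fractions


def isoform_switches(cond_a, cond_b, min_delta=0.1):
    """Return the change in isoform fraction (cond_b minus cond_a) for
    transcripts whose absolute change is at least min_delta.

    min_delta is an illustrative threshold, not a spliceR default.
    """
    if_a = isoform_fractions(cond_a)
    if_b = isoform_fractions(cond_b)

    deltas = {}
    for key in if_a.keys() & if_b.keys():
        if if_a[key] is None or if_b[key] is None:
            continue  # fraction undefined when the gene is not expressed
        delta = if_b[key] - if_a[key]
        if abs(delta) >= min_delta:
            deltas[key] = delta
    return deltas


# Hypothetical expression values (e.g. FPKM) for one gene with two isoforms.
control = {("geneA", "tx1"): 30.0, ("geneA", "tx2"): 10.0}
treated = {("geneA", "tx1"): 10.0, ("geneA", "tx2"): 30.0}

switches = isoform_switches(control, treated)
```

In this toy input the gene total is unchanged between conditions, yet the dominant isoform changes from `tx1` to `tx2`. Gene-level expression analysis alone would miss this kind of event.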
