Alternative splicing detection workflow needs a careful combination of sample prep and bioinformatics analysis

Background

RNAseq provides remarkable power for biomarker discovery and disease stratification. The main technical steps affecting the results of RNAseq experiments are Library Sample Preparation (LSP) and Bioinformatics Analysis (BA). To the best of our knowledge, the combined effect of LSP and BA has never been evaluated comparatively. Such an evaluation could help optimize alternative splicing detection, which is challenging because moderate fold-change differences must be detected against a complex background of isoforms.

Results

Different LSPs (TruSeq unstranded/stranded, ScriptSeq, NuGEN) allow the detection of a large common set of isoforms. However, each LSP also detects its own smaller set of isoforms, characterized by both lower coverage and lower FPKM than the isoforms common among LSPs. This characteristic is particularly critical for the low input RNA NuGEN v2 LSP. The effect of a low input LSP (NuGEN v2), with respect to a high input LSP (TruSeq), on the statistical detection of alternative splicing was studied using a benchmark dataset in which both synthetic reads and reads generated from high input (TruSeq) and low input (NuGEN) LSPs were spiked in. Statistical detection of alternative splicing (AltDE) was done using prototypes of BA for isoform reconstruction (Cuffdiff) and exon-level analysis (DEXSeq). Exon-level analysis performs slightly better than the isoform-reconstruction approach, although at most 50% of the spiked-in transcripts are detected. The performance of both isoform reconstruction and exon-level analysis improves as the number of input reads increases.

Conclusion

Data derived from NuGEN v2 are not the ideal input for AltDE, particularly when the exon-level approach is used. Notably, ribosomal depletion, compared with polyA+ selection, reduces the number of mappable coding reads, which is detrimental for AltDE. Furthermore, we observed that the performance of both isoform reconstruction and exon-level analysis depends strongly on the number of input reads.
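
As an illustration only, the comparison described in the Results could be sketched as follows. This is not the pipeline used in the study: the function names, the toy isoform identifiers, and the FPKM values are hypothetical, and "LSP-specific" is assumed here to mean "detected by exactly one LSP". The sketch splits detected isoforms into the common set and the sets seen by a single LSP, summarizes FPKM for each, and computes the fraction of spiked-in transcripts that a method recovers.

```python
from statistics import median


def split_common_and_specific(detected):
    """Split isoforms into those seen by every LSP and those seen by only one.

    detected: dict mapping an LSP name to a dict of isoform id -> FPKM
    (hypothetical input layout, chosen for this sketch).
    """
    isoform_sets = {lsp: set(fpkm) for lsp, fpkm in detected.items()}
    common = set.intersection(*isoform_sets.values())
    specific = {}
    for lsp, ids in isoform_sets.items():
        # Union of everything detected by the other LSPs.
        others = set().union(
            *(s for name, s in isoform_sets.items() if name != lsp)
        )
        specific[lsp] = ids - others
    return common, specific


def median_fpkm(fpkm, ids):
    """Median FPKM over the given isoform ids, or None if ids is empty."""
    values = [fpkm[i] for i in ids]
    return median(values) if values else None


def sensitivity(spiked_in, called):
    """Fraction of spiked-in transcripts present in the called set."""
    truth = set(spiked_in)
    return len(truth & set(called)) / len(truth)


# Toy data, invented for illustration; not taken from the study.
detected = {
    "TruSeq": {"iso1": 50.0, "iso2": 30.0, "iso3": 2.0},
    "NuGEN": {"iso1": 40.0, "iso2": 20.0, "iso4": 1.0},
}
common, specific = split_common_and_specific(detected)
for lsp, fpkm in detected.items():
    print(lsp, median_fpkm(fpkm, common), median_fpkm(fpkm, specific[lsp]))
```

With real data, the per-LSP FPKM tables would come from the quantification step and the called set from the Cuffdiff or DEXSeq output; both are left abstract here.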
