Detecting various types of differential splicing events using RNA-Seq data

More than 90% of human genes undergo alternative splicing, which takes several distinct forms. High-throughput RNA-Seq technology provides unprecedented opportunities to detect differential alternative splicing of pre-mRNA between transcriptomes. Alongside differential expression analysis, differential splicing analysis may provide new insight into cell development and differentiation as well as various human diseases. In this paper, we present a novel computational method for detecting various types of differential splicing events between transcriptomes using RNA-Seq data. Our method exploits the sequential dependency of base-wise read coverage signals and detects significant differential splicing events, classified into five types of splicing events supported by junction reads. For each candidate splicing event, our method takes the ratio of the normalized RNA-Seq splicing indexes of the two samples at each nucleotide position, which reduces the effect of sequencing and alignment biases. We then apply a parametric statistical test and a change-point analysis to each candidate splicing event to detect differential splicing. We applied our method to a public RNA-Seq data set of human H1 cells and neural progenitor cells differentiated from H1, and detected many significant differential splicing events falling into the five well-known types of alternative splicing. We also compared our method with two existing methods; the results indicate that our method is a promising approach that can detect additional differential splicing events not reported by the other methods.
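To make the ratio-then-change-point idea concrete, the following is a minimal Python sketch, not the implementation described in the paper. It assumes that the "normalized splicing index" can be approximated by dividing per-base coverage by the mean coverage of the region, and that the change-point step is a single mean shift in a Gaussian model selected with the Schwarz information criterion. The function names `log_ratio` and `mean_change_point`, the pseudocount, and the toy coverage vectors are all hypothetical.

```python
import math

def log_ratio(cov_a, cov_b, pseudocount=1.0):
    """Per-base log2 ratio of region-normalized coverage (assumed normalization)."""
    mean_a = sum(cov_a) / len(cov_a) + pseudocount
    mean_b = sum(cov_b) / len(cov_b) + pseudocount
    return [
        math.log2(((a + pseudocount) / mean_a) / ((b + pseudocount) / mean_b))
        for a, b in zip(cov_a, cov_b)
    ]

def _sse(xs):
    """Sum of squared deviations from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def mean_change_point(xs, eps=1e-12):
    """Locate a single mean shift by comparing SIC with and without a change.

    Returns (k, gain): xs[:k] and xs[k:] are the two segments and gain is the
    SIC improvement over the no-change model. k is None if no split improves SIC.
    """
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 positions")
    const = n * math.log(2 * math.pi) + n
    # No-change model: one mean and one variance (2 parameters).
    sic_null = const + n * math.log(_sse(xs) / n + eps) + 2 * math.log(n)
    best_k, best_sic = None, sic_null
    for k in range(2, n - 1):  # keep at least 2 positions on each side
        pooled_var = (_sse(xs[:k]) + _sse(xs[k:])) / n
        # Change model: two means and a common variance (3 parameters).
        sic_k = const + n * math.log(pooled_var + eps) + 3 * math.log(n)
        if sic_k < best_sic:
            best_k, best_sic = k, sic_k
    return best_k, sic_null - best_sic

# Toy region: sample B loses coverage over the second half (e.g. a skipped exon).
sample_a = [30] * 20
sample_b = [30] * 10 + [3] * 10
k, gain = mean_change_point(log_ratio(sample_a, sample_b))
print(k, gain > 0)
```

Working on the ratio rather than on raw coverage means that biases shared by both samples at a given position tend to cancel, so a shift in the ratio is more plausibly a splicing difference than a sequencing artifact. A real analysis would additionally need a significance test and multiple-testing correction across candidate events.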
