DIEGO: detection of differential alternative splicing using Aitchison’s geometry

Motivation: Alternative splicing is a biological process of fundamental importance in most eukaryotes. It plays a pivotal role in cell differentiation and gene regulation and has been associated with a number of different diseases. The widespread availability of RNA-Sequencing capacities allows an ever closer investigation of differentially expressed isoforms. However, most tools for differential alternative splicing (DAS) analysis do not take split reads, i.e. the most direct evidence for a splice event, into account. Here, we present DIEGO, a compositional data analysis method able to detect DAS between two sets of RNA-Seq samples based on split reads.

Results: The Python tool DIEGO works without isoform annotations and is fast enough to analyze large experiments while being robust and accurate. We provide Python and Perl parsers for common formats.

Availability and implementation: The software is available at: www.bioinf.uni-leipzig.de/Software/DIEGO.

Contact: steve@bioinf.uni-leipzig.de

Supplementary information: Supplementary data are available at Bioinformatics online.
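
To illustrate what treating split-read counts as compositional data can mean, the following is a minimal sketch of the centered log-ratio (clr) transform and the Aitchison distance that underlie Aitchison's geometry. It is not DIEGO's implementation, and the abstract does not specify the statistic DIEGO computes. The function names, the pseudocount, and the junction counts are hypothetical and chosen only for illustration.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's junction counts.

    The pseudocount is an illustrative assumption to avoid log(0);
    it is not taken from the DIEGO paper.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [v - mean_log for v in logs]


def aitchison_distance(x, y, pseudocount=1.0):
    """Euclidean distance between the clr-transformed compositions."""
    cx = clr(x, pseudocount)
    cy = clr(y, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))


# Hypothetical split-read counts for three junctions of one gene.
sample_a = [120, 30, 10]
sample_b = [240, 60, 20]   # same junction proportions, twice the depth
sample_c = [30, 120, 10]   # usage of the first two junctions is swapped

# Without a pseudocount the distance ignores sequencing depth entirely.
same_usage = aitchison_distance(sample_a, sample_b, pseudocount=0.0)
changed_usage = aitchison_distance(sample_a, sample_c)
print(same_usage, changed_usage)
```

In this sketch, samples that differ only in sequencing depth are at distance zero (up to floating-point error), while a change in relative junction usage gives a clearly positive distance. This is the property that makes a log-ratio representation attractive for comparing splice junction usage between two groups of samples.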
