IUTA: a tool for effectively detecting differential isoform usage from RNA-Seq data

Background: Most genes in mammals generate several transcript isoforms that differ in stability and translational efficiency through alternative splicing. Such alternative splicing can be tissue- and developmental stage-specific, and such specificity is sometimes associated with disease. Thus, detecting differential isoform usage for a gene between tissues or cell lines/types (differences in the fraction of total expression of a gene represented by the expression of each of its isoforms) is potentially important for cell and developmental biology.

Results: We present a new method, IUTA, that is designed to test each gene in the genome for differential isoform usage between two groups of samples. IUTA also estimates isoform usage for each gene in each sample as well as averaged across samples within each group. IUTA is the first method to formulate the testing problem as testing for equal means of two probability distributions under the Aitchison geometry, which is widely recognized as the most appropriate geometry for compositional data (vectors that contain the relative amount of each component comprising the whole). Evaluation using simulated data showed that IUTA was able to provide test results for many more genes than was Cuffdiff2 (version 2.2.0, released in Mar. 2014), and IUTA performed better than Cuffdiff2 for the limited number of genes that Cuffdiff2 did analyze. When applied to actual mouse RNA-Seq datasets from six tissues, IUTA identified 2,073 significant genes with clear patterns of differential isoform usage between a pair of tissues. IUTA is implemented as an R package and is available at http://www.niehs.nih.gov/research/resources/software/biostatistics/iuta/index.cfm.

Conclusions: Both simulation and real-data results suggest that IUTA accurately detects differential isoform usage. We believe that our analysis of RNA-Seq data from six mouse tissues represents the first comprehensive characterization of isoform usage in these tissues. IUTA will be a valuable resource for those who study the roles of alternative transcripts in cell development and disease.
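To make the notions of isoform usage and Aitchison geometry concrete, the following is a minimal Python sketch, not IUTA's implementation (IUTA itself is an R package). It treats the isoform usage of a gene as a composition (fractions summing to 1), computes the group mean as the closed geometric mean, and measures the Aitchison distance between two compositions via the centered log-ratio (clr) transform. All function names and the expression values are hypothetical and chosen for illustration only; the sketch assumes strictly positive expression values and does not perform the statistical test described in the abstract.

```python
import math

def closure(x):
    """Rescale positive values so they sum to 1 (isoform usage fractions)."""
    total = sum(x)
    return [v / total for v in x]

def clr(p):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in p]  # assumes every part is > 0
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def aitchison_mean(samples):
    """Mean under the Aitchison geometry: closed component-wise geometric mean."""
    n = len(samples)
    d = len(samples[0])
    geo = [math.exp(sum(math.log(s[j]) for s in samples) / n) for j in range(d)]
    return closure(geo)

def aitchison_distance(p, q):
    """Euclidean distance between the clr coordinates of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(p), clr(q))))

# Hypothetical isoform expression values for one gene with three isoforms,
# two samples per group (illustrative numbers, not data from the paper).
group_a = [closure([80.0, 15.0, 5.0]), closure([70.0, 20.0, 10.0])]
group_b = [closure([20.0, 30.0, 50.0]), closure([25.0, 35.0, 40.0])]

mean_a = aitchison_mean(group_a)
mean_b = aitchison_mean(group_b)
print("group A mean usage:", mean_a)
print("group B mean usage:", mean_b)
print("Aitchison distance between group means:", aitchison_distance(mean_a, mean_b))
```

Because the clr transform depends only on ratios between parts, the distance is unchanged when a sample's total expression is rescaled, which is why a compositional geometry suits usage fractions. A real analysis would also need to handle zero fractions and within-group variability, which this sketch omits.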
