Overestimation of alternative splicing caused by variable probe characteristics in exon arrays

In higher eukaryotes, alternative splicing is a common mechanism for increasing transcriptome diversity. Affymetrix exon arrays were designed as a tool for monitoring the relative expression levels of hundreds of thousands of known and predicted exons with a view to detecting alternative splicing events. In this article, we analyze exon array data from many different human and mouse tissues and uncover a systematic relationship between transcript fold change and alternative splicing as reported by the splicing index. Evidence from dilution experiments and deep sequencing suggests that this effect is of technical rather than biological origin and that it is driven by sequence features of the probes. The effect is substantial and results in a 12-fold overestimation of alternative splicing events in genes that are differentially expressed. By cross-species exon array comparison, we further show that the systematic bias persists even across species boundaries. Failure to consider this effect in data analysis would result in the reproducible false detection of apparently conserved alternative splicing events. Finally, we have developed an R software package called COSIE (Corrected Splicing Indices for Exon arrays) that, for any given set of new exon array experiments, corrects for the observed bias and improves the detection of alternative splicing (available at www.fmi.ch/groups/gbioinfo).
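
To make the quantity under discussion concrete, the following is a minimal sketch of the splicing index as it is commonly defined for exon arrays: the log2 ratio, between two conditions, of exon signal normalized to gene-level signal. It also illustrates how a probe whose response is not proportional to target abundance could produce a nonzero splicing index when only gene expression changes. The function `saturating_probe`, its parameter `k`, and all numeric values are hypothetical and chosen only for illustration; this is not the COSIE correction and not the probe model used in the article.

```python
import math


def splicing_index(exon_a, gene_a, exon_b, gene_b):
    """Log2 ratio of gene-normalized exon signal between conditions a and b."""
    return math.log2(exon_a / gene_a) - math.log2(exon_b / gene_b)


def saturating_probe(true_level, k=1000.0):
    """Hypothetical probe response that flattens at high target levels.

    Assumption for illustration only; real probe behavior depends on
    sequence features and is not modeled here.
    """
    return true_level / (1.0 + true_level / k)


# Gene expression differs 4-fold between conditions; the exon is included
# at the same rate (50% of gene level) in both, so there is no true splicing change.
gene_a, gene_b = 800.0, 200.0
inclusion = 0.5

# Ideal probe: signal proportional to abundance, so the index is zero.
si_ideal = splicing_index(inclusion * gene_a, gene_a,
                          inclusion * gene_b, gene_b)

# Non-proportional probe: the same exon now appears to be differentially spliced.
si_biased = splicing_index(saturating_probe(inclusion * gene_a), gene_a,
                           saturating_probe(inclusion * gene_b), gene_b)

print(f"ideal probe:      SI = {si_ideal:.3f}")
print(f"saturating probe: SI = {si_biased:.3f}")
```

In this toy setting the apparent splicing signal arises purely from the interaction between the gene-level fold change and the probe response, which is the kind of technical coupling the abstract describes.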
