Splicing predictions reliably classify different types of alternative splicing

Alternative splicing is a key player in the creation of complex mammalian transcriptomes, and its misregulation is associated with many human diseases. Multiple mRNA isoforms are generated from most human genes, a process mediated by the interplay of various RNA signature elements and trans-acting factors that guide spliceosomal assembly and intron removal. Here, we introduce a splicing predictor that evaluates hundreds of RNA features simultaneously to differentiate between exons that are constitutively spliced, exons that undergo alternative 5' or 3' splice-site selection, and alternative cassette-type exons. Surprisingly, binding sites for known splicing regulators did not contribute strongly to the predictor's discriminatory power. Rather, the ability of an exon to be involved in one or multiple types of alternative splicing is dictated by its immediate sequence context, mainly driven by the identity of the exon's splice sites, the conservation around them, and its exon/intron architecture. Thus, the splicing behavior of human exons can be reliably predicted based on basic RNA sequence elements.
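
To make the idea of classifying exons from their immediate sequence context concrete, the following is a minimal sketch, not the predictor described above. It assumes three hypothetical features (a polypyrimidine-tract fraction upstream of the 3' splice site, a match score against the common `GTAAGT` 5' splice-site consensus, and a capped exon length), whereas the predictor evaluates hundreds. The abstract does not name the learning algorithm, so a simple nearest-centroid rule is swapped in for illustration. All function names and the centroid values in `TOY_CENTROIDS` are invented placeholders, not results from the study.

```python
from math import dist

# First six intronic bases of the common 5' splice-site consensus (DNA alphabet).
FIVE_SS_CONSENSUS = "GTAAGT"


def five_ss_match(downstream_intron):
    """Fraction of the first six intronic bases that match the consensus."""
    window = downstream_intron[:6].upper()
    hits = sum(1 for a, b in zip(window, FIVE_SS_CONSENSUS) if a == b)
    return hits / len(FIVE_SS_CONSENSUS)


def pyrimidine_fraction(upstream_intron, span=20):
    """Fraction of C/T among the `span` bases preceding the intron's terminal AG."""
    tract = upstream_intron.upper()[-(span + 2):-2]
    if not tract:
        return 0.0
    return sum(base in "CT" for base in tract) / len(tract)


def exon_features(upstream_intron, exon, downstream_intron):
    """Hypothetical three-feature vector: 3' signal, 5' signal, scaled exon length."""
    return (
        pyrimidine_fraction(upstream_intron),
        five_ss_match(downstream_intron),
        min(len(exon), 500) / 500,  # cap at 500 nt so the feature stays in [0, 1]
    )


def nearest_centroid(features, centroids):
    """Return the class label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Invented placeholder centroids for the four exon classes; NOT measured values.
TOY_CENTROIDS = {
    "constitutive": (0.9, 0.9, 0.3),
    "alt_5ss": (0.9, 0.5, 0.3),
    "alt_3ss": (0.5, 0.9, 0.3),
    "cassette": (0.5, 0.5, 0.2),
}

if __name__ == "__main__":
    feats = exon_features("AAAAA" + "CT" * 10 + "AG", "A" * 150, "GTAAGTCCC")
    print(feats, nearest_centroid(feats, TOY_CENTROIDS))
```

In a realistic setting the centroids (or any other model parameters) would be learned from annotated exons, and the feature vector would also include conservation scores and intron lengths, which this sketch omits.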
