Rules and tools to predict the splicing effects of exonic and intronic mutations

The development of next‐generation sequencing technologies has enabled the detection of extensive arrays of germline and somatic single nucleotide variations (SNVs) in human diseases. SNVs affecting intronic GT‐AG dinucleotides invariably compromise pre‐mRNA splicing. Most exonic SNVs introduce missense/nonsense codons, but some affect auxiliary splicing cis‐elements or generate cryptic GT‐AG dinucleotides. Similarly, most intronic SNVs are silent, but some affect canonical and auxiliary splicing cis‐elements or generate cryptic GT‐AG dinucleotides. Predicting the splicing effects of SNVs, however, is challenging. The splicing effects of SNVs that generate a cryptic AG or disrupt the canonical AG can be inferred from the AG‐scanning model. Similarly, the splicing effects of SNVs affecting the first nucleotide G of an exon can be inferred from the AG‐dependence of the 3′ splice site (ss). A variety of tools have been developed for predicting the splicing effects of SNVs affecting the 5′ ss, as well as exonic and intronic splicing enhancers/silencers. In contrast, only two tools, the Human Splicing Finder and the SVM‐BP finder, are available for predicting the position of the branch point sequence. Similarly, IntSplice and Splicing‐based Analysis of Variants (SPANR) are the only tools that predict the splicing effects of intronic SNVs. The rules and tools introduced in this review are mostly based on observations of a limited number of genes, and no rule or tool can ensure 100% accuracy. Experimental validation is always required before any clinically relevant conclusions are drawn. The development of efficient tools to predict aberrant splicing will nevertheless facilitate our understanding of splicing pathomechanisms in human diseases. WIREs RNA 2018, 9:e1451. doi: 10.1002/wrna.1451
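
As an illustration of the simplest rule stated above, the following minimal Python sketch flags an intronic SNV that either hits the canonical GT‐AG dinucleotides or creates a new GT or AG dinucleotide elsewhere in the intron. It is not taken from the review or from any of the tools named there. The function name `classify_intronic_snv`, the 0‐based position convention, and the label strings are all hypothetical, and the sketch assumes a U2‐type intron given 5′ to 3′ as DNA. A newly created GT or AG is only a candidate cryptic site: whether it is actually used depends on sequence context that this sketch ignores.

```python
def classify_intronic_snv(intron: str, pos: int, alt: str) -> list[str]:
    """Flag dinucleotide-level effects of a single-nucleotide substitution.

    intron: intron sequence, 5'->3', DNA alphabet, assumed to start with GT
            and end with AG (hypothetical input convention).
    pos:    0-based index of the substituted nucleotide within the intron.
    alt:    the substituted base (A, C, G or T), different from the reference.
    """
    intron = intron.upper()
    alt = alt.upper()
    n = len(intron)
    if n < 4 or not (intron.startswith("GT") and intron.endswith("AG")):
        raise ValueError("expected an intron starting with GT and ending with AG")
    if not 0 <= pos < n:
        raise ValueError("pos is outside the intron")
    if len(alt) != 1 or alt not in "ACGT" or alt == intron[pos]:
        raise ValueError("alt must be a single base differing from the reference")

    mutant = intron[:pos] + alt + intron[pos + 1:]
    labels = []

    # Any substitution in the first or last two nucleotides alters GT or AG.
    if pos < 2:
        labels.append("disrupts canonical GT")
    if pos >= n - 2:
        labels.append("disrupts canonical AG")

    # Check the two dinucleotide windows overlapping the substituted position,
    # skipping the canonical windows at the intron ends.
    for start in (pos - 1, pos):
        if start < 0 or start + 2 > n or start in (0, n - 2):
            continue
        ref_di = intron[start:start + 2]
        mut_di = mutant[start:start + 2]
        for di in ("GT", "AG"):
            if mut_di == di and ref_di != di:
                labels.append(f"creates {di}")
    return labels


if __name__ == "__main__":
    toy_intron = "GTAAGTCTCACTTTTCCCAG"  # made-up sequence for illustration only
    print(classify_intronic_snv(toy_intron, 0, "A"))
    print(classify_intronic_snv(toy_intron, 10, "G"))
```

An empty list means only that no GT or AG dinucleotide was changed or created. Such an SNV may still affect the branch point, the polypyrimidine tract, or auxiliary cis‐elements, which is where the dedicated tools mentioned above come in.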
