Alu Exonization Events Reveal Features Required for Precise Recognition of Exons by the Splicing Machinery

Despite decades of research, how the mRNA splicing machinery precisely identifies short exonic islands within vast intronic oceans remains largely obscure. In this study, we analyzed Alu exonization events to understand the requirements for correct exon selection. Comparing exonizing Alus with their non-exonizing counterparts is informative because the two groups have retained high sequence similarity yet are perceived differently by the splicing machinery. We identified and characterized numerous features that the splicing machinery uses to discriminate between Alu exons and their non-exonizing counterparts. Of these, the most novel is secondary structure: Alu exons in general, and their 5′ splice sites (5′ss) in particular, are characterized by less stable local secondary structures than their non-exonizing counterparts. We detected numerous further differences between the two groups, including in exon–intron architecture and in the strength of splicing signals, enhancers, and silencers. Support vector machine analysis revealed that these features allow a high level of discrimination (AUC = 0.91) between exonizing and non-exonizing Alus. Moreover, the computationally derived probabilities of exonization correlated significantly with the biological inclusion level of the Alu exons, and the model could also be extended to general datasets of constitutive and alternative exons. This indicates that the features detected and explored in this study provide the basis not only for precise exon selection but also for its fine-tuned regulation, as manifested in alternative splicing.
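For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It is not the authors' pipeline: the function name `roc_auc`, the toy labels, and the toy scores are all hypothetical, and the scores merely stand in for the exonization probabilities a trained classifier would output. AUC is computed here as the fraction of (exonizing, non-exonizing) pairs in which the exonizing example receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Return the AUC: the probability that a randomly chosen positive
    example scores higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = exonizing Alu, 0 = non-exonizing Alu.
# The scores are made-up stand-ins for predicted exonization probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so the reported 0.91 means that in roughly nine of ten such pairs the exonizing Alu is ranked above the non-exonizing one.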
