Comparative analysis of sequence features involved in the recognition of tandem splice sites

Background: The splicing of pre-mRNAs is strikingly variable and often produces multiple alternatively spliced (AS) isoforms that encode different messages from one gene locus. Computational studies have uncovered a class of highly similar isoforms related to tandem 5'-splice sites (5'ss) and 3'-splice sites (3'ss), but experimental studies offer only sparse, anecdotal evidence for them. To compare the types and levels of alternative tandem splice site exons occurring in different human organ systems and cell types, and to study known sequence features involved in the recognition and distinction of neighboring splice sites, we performed large-scale, stringent alignments of cDNA sequences and ESTs to the human and mouse genomes, followed by experimental validation.

Results: We analyzed alternative 5'ss exons (A5Es) and alternative 3'ss exons (A3Es), derived from transcript sequences that were aligned to assembled genome sequences, to infer patterns of AS occurring in several thousand genes. Comparing the levels of overlapping (tandem) and non-overlapping (competitive) A5Es and A3Es, we saw a clear preference of isoforms for tandem acceptors and donors, with exon extensions four nucleotides and three to six nucleotides long, respectively. A subset of inferred A5E tandem exons was selected and experimentally validated. Focusing on A5Es, we investigated their transcript coverage, sequence conservation and base-pairing to U1 snRNA, proximal and distal splice site classification, and candidate motifs for cis-regulatory activity, and compared A5Es with A3Es, constitutive exons and pseudo-exons in H. sapiens and M. musculus. The results reveal a small but authentic, enriched set of preferred tandem splice sites with specific distances between proximal and distal 5'ss (3'ss). They show a marked dichotomy between the levels of in-frame and out-of-frame splicing for A5Es and A3Es, respectively, identify a number of candidate NMD targets, and allow a rough estimate of the number of undetected tandem donors based on splice site information.

Conclusion: This comparative study distinguishes tandem 5'ss and 3'ss with extensions three to six nucleotides long as having unusually high proportions of AS, experimentally validates tandem donors in a panel of different human tissues, highlights the dichotomy in the types of AS occurring at tandem splice sites, and shows that human alternative exons spliced at overlapping 5'ss possess features of typical splice variants that could well be beneficial for the cell.
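The in-frame versus out-of-frame distinction mentioned above comes down to whether the length of the exon extension is a multiple of three. The following minimal Python sketch illustrates that arithmetic only; the function name `frame_effect` is a hypothetical helper introduced here for illustration and is not taken from the study's methods.

```python
def frame_effect(extension_nt: int) -> str:
    """Classify an exon extension by its effect on the reading frame.

    extension_nt is the distance in nucleotides between the proximal and
    distal splice site (an illustrative parameter, not the paper's notation).
    """
    if extension_nt <= 0:
        raise ValueError("extension length must be a positive integer")
    # An extension that is a multiple of three adds or removes whole codons;
    # any other length shifts the downstream reading frame.
    return "in-frame" if extension_nt % 3 == 0 else "out-of-frame"


if __name__ == "__main__":
    # Extension lengths named in the abstract: three to six nucleotides.
    for n in (3, 4, 5, 6):
        print(n, frame_effect(n))
```

Out-of-frame extensions shift the downstream codons and can introduce a premature stop codon, which is why such isoforms are natural candidates when screening for NMD targets.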
