Inference of Splicing Regulatory Activities by Sequence Neighborhood Analysis

Sequence-specific recognition of nucleic-acid motifs is critical to many cellular processes. We have developed a new and general method called Neighborhood Inference (NI) that predicts sequences with activity in regulating a biochemical process based on the local density of known sites in sequence space. Applied to the problem of RNA splicing regulation, NI was used to predict hundreds of new exonic splicing enhancer (ESE) and silencer (ESS) hexanucleotides from known human ESEs and ESSs. These predictions were supported by cross-validation analysis, by analysis of published splicing regulatory activity data, by sequence-conservation analysis, and by measurement of the splicing regulatory activity of 24 novel predicted ESEs, ESSs, and neutral sequences using an in vivo splicing reporter assay. These results demonstrate the ability of NI to accurately predict splicing regulatory activity and show that the scope of exonic splicing regulatory elements is substantially larger than previously anticipated. Analysis of orthologous exons in four mammals showed that the NI score of ESEs, a measure of function, is much more highly conserved above background than ESE primary sequence. This observation indicates a high degree of selection for ESE activity in mammalian exons, with surprisingly frequent interchangeability between ESE sequences.
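
The idea of scoring a sequence by the local density of known sites around it in sequence space can be illustrated with a minimal sketch. This is not the NI algorithm itself, whose details are not given in this abstract. It assumes a neighborhood defined as all sequences at Hamming distance 1, a score equal to the fraction of neighbors that are known enhancers minus the fraction that are known silencers, and small made-up example sets (`toy_enhancers`, `toy_silencers`) that are placeholders rather than published ESE or ESS data. The function names are likewise hypothetical.

```python
ALPHABET = "ACGU"  # RNA alphabet


def hamming_neighbors(kmer):
    """Return every sequence that differs from kmer at exactly one position."""
    neighbors = []
    for i, base in enumerate(kmer):
        for alt in ALPHABET:
            if alt != base:
                neighbors.append(kmer[:i] + alt + kmer[i + 1:])
    return neighbors


def neighborhood_score(kmer, enhancers, silencers):
    """Assumed scoring rule (not the published one): fraction of one-mismatch
    neighbors that are known enhancers minus the fraction that are known
    silencers. The result lies in [-1, 1]."""
    neighbors = hamming_neighbors(kmer)
    n_enh = sum(seq in enhancers for seq in neighbors)
    n_sil = sum(seq in silencers for seq in neighbors)
    return (n_enh - n_sil) / len(neighbors)


# Placeholder sets for illustration only; not real ESE/ESS annotations.
toy_enhancers = {"GAAGAA", "GAAGAU", "GAAGAC"}
toy_silencers = {"UUAGGG", "UUAGGU"}

for candidate in ("GAAGAG", "UUAGGA", "CCCCCC"):
    score = neighborhood_score(candidate, toy_enhancers, toy_silencers)
    print(candidate, round(score, 3))
```

Under these assumptions, a candidate hexamer surrounded by known enhancers receives a positive score, one surrounded by known silencers a negative score, and one with no known neighbors a score of zero, which corresponds to a prediction of neutral activity.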
