Bayesian prediction of tissue-regulated splicing using RNA sequence and cellular context

Motivation: Alternative splicing is a major contributor to cellular diversity in mammalian tissues and relates to many human diseases. An important goal in understanding this phenomenon is to infer a 'splicing code' that predicts how splicing is regulated in different cell types by features derived from RNA, DNA and epigenetic modifiers.

Methods: We formulate the assembly of a splicing code as a problem of statistical inference and introduce a Bayesian method that uses an adaptively selected number of hidden variables to combine subgroups of features into a network, allows different tissues to share feature subgroups and uses a Gibbs sampler to hedge predictions and ascertain the statistical significance of identified features.

Results: Using data for 3665 cassette exons, 1014 RNA features and 4 tissue types derived from 27 mouse tissues (http://genes.toronto.edu/wasp), we benchmarked several methods. Our method outperforms all others, and achieves relative improvements of 52% in splicing code quality and up to 22% in classification error, compared with the state of the art. Novel combinations of regulatory features and novel combinations of tissues that share feature subgroups were identified using our method.

Contact: frey@psi.toronto.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
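
The Methods statement mentions using a Gibbs sampler to hedge predictions. One common reading of this is that the prediction for an exon is averaged over many posterior samples of the model parameters rather than taken from a single fitted model. The sketch below illustrates only that averaging step, under stated assumptions: the softmax classifier, the three output classes, the feature vector and the randomly perturbed weight samples are all hypothetical stand-ins, not the authors' network or their sampler output.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Class probabilities for one parameter sample (one weight row per class)."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def hedged_prediction(weight_samples, features):
    """Average the predicted class distributions over all posterior samples."""
    n_classes = len(weight_samples[0])
    avg = [0.0] * n_classes
    for weights in weight_samples:
        for k, p in enumerate(predict(weights, features)):
            avg[k] += p / len(weight_samples)
    return avg


# Hypothetical stand-in for sampler output: 200 noisy copies of a weight
# matrix (3 assumed classes x 2 assumed features). A real Gibbs sampler
# would produce these draws from the posterior instead.
rng = random.Random(0)
mean_weights = [[1.0, -0.5], [0.0, 0.0], [-1.0, 0.5]]
weight_samples = [
    [[w + rng.gauss(0.0, 0.5) for w in row] for row in mean_weights]
    for _ in range(200)
]

features = [1.0, 2.0]  # hypothetical feature vector for one exon
hedged = hedged_prediction(weight_samples, features)
print(hedged)
```

Averaging over samples tends to give less extreme probabilities than any single sample when the samples disagree, which is the sense in which the prediction is hedged.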
