Accurate Prediction of Alternatively Spliced Cassette Exons Using Evolutionary Conservation Information and Logitlinear Model

Accurate prediction of alternative splicing (AS) is important for understanding the mechanisms of gene regulation and for studying the pathogenesis of diseases. By analyzing evolutionary conservation in alternatively spliced cassette exons (SCEs), constitutively spliced exons (CSEs), and their flanking intronic regions, we found that the distributions of the average conservation score (AConScore) for SCEs differ significantly from those for CSEs. We integrated these distinguishing features into a logitlinear model and constructed a classifier, named AEDetector, that distinguishes SCEs from CSEs. Five-fold cross-validation shows that the sensitivity and specificity of AEDetector are higher than those recently reported on the same dataset.
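The pipeline summarized above (average conservation scores as features, a logitlinear classifier, five-fold cross-validation reporting sensitivity and specificity) can be illustrated with a minimal Python sketch. It is not AEDetector: the abstract does not specify the feature set, the fitting procedure, or the data, so the three features (exon, upstream intron, downstream intron), the gradient-descent fit, the 0.5 decision threshold, and the synthetic scores produced by the hypothetical `make_toy_data` helper are all assumptions made for illustration.

```python
import math
import random

def acon_score(per_base_scores):
    """Average conservation score of one region (mean of per-base scores)."""
    return sum(per_base_scores) / len(per_base_scores)

def predict_prob(weights, features):
    """Logit-linear model: the log-odds of being an SCE are linear in the features."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
    return 1.0 / (1.0 + math.exp(-z))

def fit(X, y, lr=0.5, epochs=500):
    """Fit the weights by batch gradient descent on the logistic loss (assumed fitting method)."""
    weights = [0.0] * (len(X[0]) + 1)  # weights[0] is the intercept
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict_prob(weights, xi) - yi
            grad[0] += err
            for j, value in enumerate(xi):
                grad[j + 1] += err * value
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def five_fold_cv(X, y, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, and pool the predictions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        weights = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            y_true.append(y[i])
            y_pred.append(1 if predict_prob(weights, X[i]) >= 0.5 else 0)
    return sensitivity_specificity(y_true, y_pred)

def make_toy_data(n_per_class=60, region_len=50, seed=1):
    """Hypothetical synthetic data; the class means are arbitrary, not taken from the paper."""
    rng = random.Random(seed)

    def region(mean):
        # Per-base scores clipped to [0, 1].
        return [min(1.0, max(0.0, rng.gauss(mean, 0.2))) for _ in range(region_len)]

    X, y = [], []
    for label, intron_mean in ((1, 0.8), (0, 0.3)):  # 1 = SCE, 0 = CSE
        for _ in range(n_per_class):
            X.append([acon_score(region(0.9)),          # exon
                      acon_score(region(intron_mean)),  # upstream intron
                      acon_score(region(intron_mean))]) # downstream intron
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    sens, spec = five_fold_cv(X, y)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Pooling the held-out predictions across folds before computing sensitivity and specificity is one common convention; averaging per-fold values is another, and the abstract does not say which was used.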
