WebScipio: reconstructing alternative splice variants of eukaryotic proteins

Accurate exon–intron structures are an essential prerequisite for genomics, proteomics and many protein family and single-gene studies. We originally developed Scipio and the corresponding web service WebScipio to reconstruct gene structures from protein sequences and available genome assemblies. WebScipio can also predict mutually exclusive spliced exons and tandemly arrayed gene duplicates. The resulting gene structures are shown as graphical schemes and can be analysed down to the nucleotide level. The set of eukaryotic genomes available on the WebScipio server is updated daily. The current version of the web server provides access to ∼3400 genome assembly files from >1100 sequenced eukaryotic species. Here, we extend the functionality with a module that maps expressed sequence tag (EST) and cDNA data to the reconstructed gene structure to identify all types of alternative splice variants. WebScipio has a user-friendly web interface, and we believe the improved web server will better serve biologists interested in the gene structure corresponding to their protein of interest, including all types of alternative splice forms and tandem gene duplicates. WebScipio is freely available at http://www.webscipio.org.
