A computational investigation of kinetoplastid trans-splicing

Trans-splicing is an unusual process in which two separate RNA strands are spliced together to yield a mature mRNA. We present a novel computational approach that achieves an overall accuracy of 82% and predicts 92% of known trans-splicing sites. Applied to chromosomes 1 and 3 of Leishmania major, the method yields high-confidence predictions for 85% and 88% of annotated genes, respectively. We also suggest extensions of the method to other systems.
