Analysis and prediction of gene splice sites in four Aspergillus genomes.

Genomic sequences of several Aspergillus species have been published, and many more are in progress. Meaningful comparisons between these genomes require high-quality, consistently annotated protein sets from each of them. We have developed NetAspGene, a dedicated, publicly available splice site prediction program for the genus Aspergillus. Gene sequences from Aspergillus fumigatus, the most common mould pathogen, were used to build and test the model. Because Aspergillus introns are smaller than those of many animals and plants, we trained single local networks with a larger window size, so that the input covers both donor and acceptor site information. We also applied NetAspGene to other Aspergilli, including Aspergillus nidulans, Aspergillus oryzae, and Aspergillus niger. Evaluation with independent data sets reveals that NetAspGene predicts splice sites substantially better than other available tools. NetAspGene should be helpful for the study of Aspergillus splice sites, and especially of alternative splicing. A webpage for NetAspGene is publicly available at http://www.cbs.dtu.dk/services/NetAspGene.
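The window-based input described above can be illustrated with a minimal sketch. It is not the NetAspGene implementation: the function names, the flank of 30 bases, and the one-hot encoding are assumptions chosen for illustration. The sketch scans a sequence for candidate donor dinucleotides (GT) and encodes a symmetric window around each one, in the form a local network could take as input. With short introns, a sufficiently wide window around a donor candidate may also include the matching acceptor region.

```python
# Illustrative sketch only: the flank size and encoding are assumptions,
# not the parameters used by NetAspGene.
ALPHABET = "ACGT"


def one_hot(base):
    """Encode one nucleotide as 4 bits; unknown bases (e.g. N) become all zeros."""
    return [1 if base == b else 0 for b in ALPHABET]


def candidate_windows(seq, dinucleotide="GT", flank=30):
    """Yield (position, encoded_window) for every occurrence of `dinucleotide`
    that has `flank` bases available on both sides.

    The window spans `flank` bases upstream, the dinucleotide itself, and
    `flank` bases downstream, so it holds 2 * flank + 2 nucleotides.
    """
    seq = seq.upper()
    for i in range(flank, len(seq) - flank - 1):
        if seq[i:i + 2] == dinucleotide:
            window = seq[i - flank:i + 2 + flank]
            yield i, [bit for base in window for bit in one_hot(base)]


if __name__ == "__main__":
    # Toy sequence with a small flank so the output stays readable.
    for position, encoded in candidate_windows("ACGTACGTAC", flank=2):
        print(position, len(encoded))
```

Passing `dinucleotide="AG"` gives the corresponding acceptor candidates. Each encoded vector would then be scored by a trained classifier, which this sketch does not include.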
