e-Fungi: a data resource for comparative analysis of fungal genomes

Background

The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed across various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that makes it easier to express requests that clarify points of similarity or difference between species.

Description

To support systematic comparative analyses of fungal genomes, we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with the results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and with predictions of signaling proteins and of the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts to improve the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows.

Conclusion

The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large-scale analyses over all the genomes stored in the database.
The database is accessible at http://www.e-fungi.org.uk, as is the WSDL for the web services.
