TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

The TBestDB database contains approximately 370,000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes. It includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequences are masked, and the ESTs are clustered using phrap. The resulting consensus sequences are annotated automatically with the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and for potential cross-contamination between organisms, and suspect data are either flagged in the database or removed from it. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact tbestdb@bch.umontreal.ca. The database can be queried at http://tbestdb.bcm.umontreal.ca/.
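The first two processing steps described above (reading each EST up to a quality cut-off, then masking vector and linker sequence) can be illustrated with a minimal Python sketch. It is not the TBestDB implementation: the function names, the cut-off value of 20, the reading of "up to a minimum quality cut-off" as truncation at the first low-quality base, and the exact-match masking are all assumptions made for illustration. Production pipelines typically rely on alignment-based vector screening rather than exact string matching.

```python
from typing import Sequence


def trim_to_quality(seq: str, quals: Sequence[int], min_q: int = 20) -> str:
    """Keep the leading bases up to the first base whose quality is below min_q.

    min_q = 20 is an illustrative value, not the cut-off used by TBestDB.
    """
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i]
    return seq


def mask_vector(seq: str, vector: str, mask_char: str = "X") -> str:
    """Replace every exact occurrence of a known vector/linker string with mask_char.

    Exact matching is a simplification chosen for this sketch.
    """
    if not vector:
        return seq
    return seq.replace(vector, mask_char * len(vector))


if __name__ == "__main__":
    read = "GGATCCACGTACGTTTGA"              # hypothetical read
    quals = [30] * 14 + [8, 30, 30, 30]      # quality drops at position 14
    trimmed = trim_to_quality(read, quals)
    masked = mask_vector(trimmed, "GGATCC")  # hypothetical linker sequence
    print(trimmed)
    print(masked)
```

Masking rather than deleting the vector keeps base coordinates unchanged, which is convenient when the masked reads are later passed to a clustering program.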
