PEDANT covers all complete RefSeq genomes

The PEDANT genome database provides exhaustive annotation of nearly 3000 publicly available eukaryotic, eubacterial, archaeal and viral genomes, comprising more than 4.5 million proteins, using a broad set of bioinformatics algorithms. In particular, all completely sequenced genomes from the NCBI Reference Sequence collection (RefSeq) are covered. The PEDANT processing pipeline has been sped up by an order of magnitude by using precalculated similarity information stored in the Similarity Matrix of Proteins (SIMAP) database, which makes it possible to process newly sequenced genomes as soon as they become available. PEDANT is freely accessible to academic users at http://pedant.gsf.de. For programmatic access, Web Services are available at http://pedant.gsf.de/webservices.jsp.
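The speed-up from precalculated similarity information can be illustrated with a minimal sketch. It is not the PEDANT or SIMAP implementation: `SimilarityStore`, `sequence_key`, `annotate` and the `search` callback are hypothetical names introduced here, and keying the store by an MD5 digest of the sequence is an assumption made only for this example. The idea shown is that an expensive similarity search runs only for sequences that have no stored result.

```python
import hashlib


def sequence_key(seq: str) -> str:
    """Hypothetical key: MD5 digest of the upper-cased protein sequence."""
    return hashlib.md5(seq.upper().encode("ascii")).hexdigest()


class SimilarityStore:
    """Toy in-memory stand-in for a database of precalculated similarity hits."""

    def __init__(self):
        self._hits = {}

    def add(self, seq, hits):
        self._hits[sequence_key(seq)] = list(hits)

    def lookup(self, seq):
        # Returns None when no precalculated result exists for this sequence.
        return self._hits.get(sequence_key(seq))


def annotate(proteins, store, search):
    """Collect similarity hits per protein ID.

    `search` is the expensive fallback (an assumption standing in for a real
    similarity search); it is called only for sequences missing from `store`.
    Returns the hits and the number of searches actually run.
    """
    results = {}
    computed = 0
    for protein_id, seq in proteins.items():
        hits = store.lookup(seq)
        if hits is None:
            hits = search(seq)
            store.add(seq, hits)  # keep the result for later genomes
            computed += 1
        results[protein_id] = hits
    return results, computed
```

In this sketch, a newly sequenced genome whose proteins are mostly already in the store needs only a handful of fresh searches, which is the effect the abstract attributes to the use of SIMAP.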
