Proteome Analysis Database: online application of InterPro and CluSTr for the functional classification of proteins in whole genomes

The SWISS-PROT group at EBI has developed the Proteome Analysis Database, which draws on existing resources to provide comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes (http://www.ebi.ac.uk/proteome/). The two main projects used, InterPro and CluSTr, give a new perspective on families, domains and sites and cover 31-67% (InterPro statistics) of the proteins from each of the complete genomes. CluSTr covers the three complete eukaryotic genomes and the incomplete human genome data. The Proteome Analysis Database is accompanied by a program designed to carry out InterPro proteome comparisons of any one proteome against one or more of the other proteomes in the database.
