The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000

SWISS-PROT is a curated protein sequence database that strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files, and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT. SWISS-PROT is available at http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/
