The Gene Ontology Annotation (GOA) Project—Application of GO in SWISS-PROT, TrEMBL and InterPro

As proteomics research gains momentum, biologists need new ways to access and analyse information on proteins. Many new gene products, from a wide range of species, are being added to the SWISS-PROT Protein Knowledgebase (the world’s most highly annotated protein sequence database) and its supplement, TrEMBL [3]. To fully exploit the potential of these data, the SWISS-PROT group at EBI aims to capture all the available biological information related to these sequences, especially for components of the human proteome. One important challenge in this endeavour is to make all our databases describe, in a consistent way, what each protein does.

[1] Rolf Apweiler, et al. CluSTr: a database of clusters of SWISS-PROT+TrEMBL proteins, 2001, Nucleic Acids Res.

[2] Peer Bork, et al. Recent improvements to the SMART domain-based sequence annotation resource, 2002, Nucleic Acids Res.

[3] J. Schug, et al. Predicting gene ontology functions from ProDom and CDD protein domains, 2002, Genome Research.

[4] T. N. Bhat, et al. The Protein Data Bank: unifying the archive, 2002, Nucleic Acids Res.

[5] Rolf Apweiler, et al. The SWISS-PROT protein sequence data bank and its supplement TrEMBL, 1997, Nucleic Acids Res.

[6] J. A. Blake, et al. Program description: Strategies for biological annotation of mammalian systems: implementing gene ontologies in mouse genome informatics, 2001, Genomics.

[7] Kara Dolinski, et al. Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO), 2002, Nucleic Acids Res.

[8] Rolf Apweiler, et al. Proteome Analysis Database, 2004.

[9] D. Shields, et al. The evolution of haematopoietic cytokine/receptor complexes, 1995, Cytokine.

[10] Rolf Apweiler, et al. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000, 2000, Nucleic Acids Res.

[11] Amos Bairoch, et al. The ENZYME database in 2000, 2000, Nucleic Acids Res.

[12] Rolf Apweiler, et al. The EBI SRS Server: Recent Developments, 2002, German Conference on Bioinformatics.

[13] Rolf Apweiler, et al. Functional Information in SWISS-PROT: the Basis for Large-scale Characterisation of Protein Sequences, 2001, Briefings Bioinform.

[14] Amos Bairoch, et al. The PROSITE database, its status in 2002, 2002, Nucleic Acids Res.

[15] Jérôme Gouzy, et al. ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons, 2000, Nucleic Acids Res.

[16] Amos Bairoch, et al. PROSITE: A Documented Database Using Patterns and Profiles as Motif Descriptors, 2002, Briefings Bioinform.

[17] Philip Lijnzaad, et al. The Ensembl genome database project, 2002, Nucleic Acids Res.

[18] Sally Goodman, et al. EU ponders joint action on cancer, 2002, Nature.

[19] Rolf Apweiler, et al. TEMBLOR – Perspectives of EBI Database Services, 2002, Comparative and Functional Genomics.

[20] Rolf Apweiler, et al. A novel method for automatic functional annotation of proteins, 1999, Bioinform.

[21] Chris Sander, et al. The HSSP database of protein structure-sequence alignments and family profiles, 1998, Nucleic Acids Res.

[22] Donna R. Maglott, et al. RefSeq and LocusLink: NCBI gene-centered resources, 2001, Nucleic Acids Res.

[23] Rolf Apweiler, et al. Applications of InterPro in Protein Annotation and Genome Analysis, 2002, Briefings Bioinform.

[24] Terri K. Attwood, et al. PRINTS and PRINTS-S shed light on protein ancestry, 2002, Nucleic Acids Res.

[25] P. Familletti, et al. Cloning and expression of murine IL-12, 1992, Journal of Immunology.

[26] Judith A. Blake, et al. The Mouse Genome Database (MGD): the model organism database for the laboratory mouse, 2002, Nucleic Acids Res.

[27] Fan Yang, et al. TIGRFAMs: a protein family resource for the functional identification of proteins, 2001, Nucleic Acids Res.

[28] J. Blake, et al. Creating the Gene Ontology Resource: Design and Implementation, 2001.

[29] Vincent Lombard, et al. The EMBL Nucleotide Sequence Database: major new developments, 2003, Nucleic Acids Res.

[30] Alex Bateman, et al. The InterPro database, an integrated documentation resource for protein families, domains and functional sites, 2001, Nucleic Acids Res.

[31] Rolf Apweiler, et al. Proteome Analysis Database: online application of InterPro and CluSTr for the functional classification of proteins in whole genomes, 2001, Nucleic Acids Res.