The Gene Ontology Annotation (GOA) Database: sharing knowledge in UniProt with Gene Ontology

The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality electronic and manual annotations to the UniProt Knowledgebase (Swiss-Prot, TrEMBL and PIR-PSD) using the standardized vocabulary of the Gene Ontology (GO). As a supplementary archive of GO annotation, GOA promotes a high level of integration of the knowledge represented in UniProt with other databases. This is achieved by converting UniProt annotation into a recognized computational format. GOA provides annotated entries for nearly 60,000 species (GOA-SPTr) and is the largest and most comprehensive open-source contributor of annotations to the GO Consortium annotation effort. By integrating GO annotations from other model organism groups, GOA consolidates specialized knowledge and expertise to ensure the data remain a key reference for up-to-date biological information. Furthermore, the GOA database fully endorses the Human Proteomics Initiative by prioritizing the annotation of proteins relevant to human health and disease. In addition to a non-redundant set of annotations to the human proteome (GOA-Human) and monthly releases of its GO annotation for all species (GOA-SPTr), a series of GO mapping files and specific cross-references in other databases are also regularly distributed. GOA can be queried through a simple user-friendly web interface or downloaded in a parsable format via the EBI and GO FTP sites. The GOA data set can be used to enhance the annotation of particular model organism or gene expression data sets, and it is increasingly used to evaluate GO predictions generated from text mining or protein interaction experiments. In 2004, the GOA team will build on its success and will continue to supplement the functional annotation of UniProt and work towards enhancing the ability of scientists to access all available biological information.
Researchers wishing to query or contribute to the GOA project are encouraged to email: goa@ebi.ac.uk.
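
As an illustration of what working with the downloadable, parsable files might look like, the following is a minimal Python sketch that counts electronic versus manual annotations. It assumes the tab-delimited gene association layout used by the GO Consortium (lines beginning with "!" are comments; the protein accession, GO identifier and evidence code sit in the second, fifth and seventh columns) and treats the IEA evidence code as marking electronic annotation. The function names and the three sample records are hypothetical placeholders invented for the example, not GOA data or GOA software.

```python
import io
from collections import Counter

# Zero-based column positions in the assumed gene association layout.
DB_OBJECT_ID = 1   # protein accession
GO_ID = 4          # GO term identifier
EVIDENCE = 6       # evidence code, e.g. IEA, IPI, TAS


def parse_annotations(handle):
    """Yield (accession, go_id, evidence_code) for each annotation line.

    Comment lines starting with '!' and blank lines are skipped.
    """
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[DB_OBJECT_ID], fields[GO_ID], fields[EVIDENCE]


def summarize(handle):
    """Count annotations as 'electronic' (IEA) or 'manual' (any other code)."""
    counts = Counter()
    for _accession, _go_id, evidence in parse_annotations(handle):
        counts["electronic" if evidence == "IEA" else "manual"] += 1
    return counts


def _row(accession, go_id, reference, evidence, aspect):
    """Build one 15-column placeholder record (hypothetical values)."""
    return "\t".join([
        "UniProt", accession, "EXAMPLE", "", go_id, reference, evidence,
        "", aspect, "", "", "protein", "taxon:9606", "20040101", "UniProt",
    ])


# Invented sample content standing in for a downloaded file.
SAMPLE = "\n".join([
    "!illustrative header comment",
    _row("P00001", "GO:0005634", "REF:placeholder1", "IEA", "C"),
    _row("P00001", "GO:0005515", "REF:placeholder2", "IPI", "F"),
    _row("P00002", "GO:0005634", "REF:placeholder1", "IEA", "C"),
]) + "\n"

if __name__ == "__main__":
    print(summarize(io.StringIO(SAMPLE)))
```

To run this against a real download, an open file handle would replace the `io.StringIO(SAMPLE)` object; the column positions should first be checked against the format documentation shipped with the release.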