Plant Protein Annotation in the UniProt Knowledgebase

The Swiss-Prot, TrEMBL, Protein Information Resource (PIR), and DNA Data Bank of Japan (DDBJ) protein database activities have united to form the Universal Protein Resource (UniProt) Consortium. UniProt presents three database layers: the UniProt Archive, the UniProt Knowledgebase (UniProtKB), and the UniProt Reference Clusters. The UniProtKB consists of two sections: UniProtKB/Swiss-Prot (fully manually curated entries) and UniProtKB/TrEMBL (entries with automated annotation, classification, and extensive cross-references). New releases are published fortnightly. A specific Plant Proteome Annotation Program (http://www.expasy.org/sprot/ppap/) was initiated to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Through UniProt, our aim is to provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information that will allow the plant community to fully explore and utilize the wealth of information available for both plant and nonplant model organisms.
