SWISS-PROT: connecting biomolecular knowledge via a protein database.

The explosive growth of biological data has required new means of data storage. Increasingly, biological information is no longer published in the conventional way, in a scientific journal, but is only deposited in a database. Over the last two decades these databases have become essential tools for researchers in the biological sciences. Biological databases can be classified according to the type of information they contain. There are three basic types of sequence-related databases (nucleic acid sequences, protein sequences and protein tertiary structures), as well as various specialized data collections. Because all of these databases are scientifically connected, and each holds an important piece of the picture of biological complexity, it is important to provide users of biomolecular databases with a degree of integration between them. In this review we highlight our efforts to connect biological information, as demonstrated in the SWISS-PROT protein database.
