BioMolQuest: integrated database-based retrieval of protein structural and functional information

MOTIVATION: Information about a particular protein or protein family is usually distributed among multiple databases, often across more than one entry in each database. Retrieving and organizing this information can be laborious, and the task is complicated further by the existence of alternative terms for the same concept.

RESULTS: The PDB, SWISS-PROT, ENZYME, and CATH databases have been imported into a combined relational database, BioMolQuest. A powerful search engine has been built using this database as a back end. The search engine achieves significant improvements in query performance by automatically utilizing cross-references between the legacy databases. The results of the queries are presented in an organized, hierarchical way.
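The idea of following cross-references inside a combined relational database can be illustrated with a minimal sketch. It is not the BioMolQuest schema or implementation: the table names (`sequence_entry`, `structure_entry`, `xref`), the functions `build_toy_db` and `search`, and all identifiers and rows are hypothetical and chosen only for illustration. SQLite stands in for whatever relational back end is actually used.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with hypothetical tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sequence_entry  (acc TEXT PRIMARY KEY, description TEXT);
        CREATE TABLE structure_entry (pdb_id TEXT PRIMARY KEY, title TEXT);
        -- One row per cross-reference from a sequence entry to a structure.
        CREATE TABLE xref (acc TEXT, pdb_id TEXT);
    """)
    # Toy data: accessions and structure IDs are made up, not real entries.
    conn.executemany(
        "INSERT INTO sequence_entry VALUES (?, ?)",
        [("ACC1", "ribonuclease T1"), ("ACC2", "thioredoxin")],
    )
    conn.executemany(
        "INSERT INTO structure_entry VALUES (?, ?)",
        [("1AAA", "guanyl-specific RNase, complex"),
         ("2BBB", "thioredoxin, oxidized")],
    )
    conn.executemany(
        "INSERT INTO xref VALUES (?, ?)",
        [("ACC1", "1AAA"), ("ACC2", "2BBB")],
    )
    return conn


def search(conn, keyword):
    """Match the keyword against sequence descriptions, then follow
    cross-references to collect linked structures, grouped per accession."""
    pattern = f"%{keyword.lower()}%"
    rows = conn.execute(
        """
        SELECT s.acc, s.description, p.pdb_id
        FROM sequence_entry s
        LEFT JOIN xref x ON x.acc = s.acc
        LEFT JOIN structure_entry p ON p.pdb_id = x.pdb_id
        WHERE lower(s.description) LIKE ?
        ORDER BY s.acc, p.pdb_id
        """,
        (pattern,),
    ).fetchall()

    results = {}
    for acc, description, pdb_id in rows:
        entry = results.setdefault(
            acc, {"description": description, "structures": []}
        )
        if pdb_id is not None:  # sequence entries may have no linked structure
            entry["structures"].append(pdb_id)
    return results
```

In this toy data, a search for "ribonuclease" reaches structure `1AAA` through the cross-reference even though that structure's own title uses the alternative term "RNase", which is the kind of situation the abstract describes. The nested dictionary is one simple way to present results hierarchically.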
