SPINE 2: a system for collaborative structural proteomics within a federated database framework.

We present version 2 of the SPINE system for structural proteomics. SPINE is available over the web at http://nesg.org. It serves as the central hub for the Northeast Structural Genomics Consortium (NESG), allowing collaborative structural proteomics to be carried out in a distributed fashion. The core of SPINE is a laboratory information management system (LIMS) that records key information on the consortium's progress in cloning, expressing and purifying proteins and then solving their structures by NMR or X-ray crystallography. Originally, SPINE focused on tracking constructs; in its current form, it can track target sample tubes and store detailed sample histories. The core database comprises a set of standard relational tables and a data dictionary, which together form an initial ontology for proteomic properties and provide a framework for large-scale data mining. Moreover, SPINE sits at the center of a federation of interoperable information resources. These can be divided into (i) local resources closely coupled with SPINE that enable it to handle less standardized information (e.g. integrated mailing and publication lists), (ii) other information resources in the NESG that are interlinked with SPINE (e.g. crystallization LIMS local to particular laboratories) and (iii) international archival resources that SPINE links to and passes information on to (e.g. TargetDB at the PDB).
