Demand-Driven Database Integration for Biomolecular Applications

As a member of the consortium for the "Computation and Prediction of Receptor-Ligand Interaction", the Integrated Publication and Information Systems Institute (GMD-IPSI), Darmstadt, participates in the national joint project RELIWE. Docking-D is the part of RELIWE that addresses heterogeneous database support, and GMD-IPSI takes the leading role in it. At present, the receptor and ligand data used within the project, whether raw data or data derived during analysis, are extremely heterogeneous. Many of the databases are maintained by autonomous systems that employ different data management facilities with heterogeneous data models, in particular dedicated file systems with specialized retrieval and presentation functionality (e.g. PDB [1]) or a relational model (e.g. Whatif [20]). In addition, the information is represented at different levels of detail (e.g. sequence vs. structural data), with mutual inconsistencies in structure, naming, scaling, and behavior; much of this behavior is hidden in the implementation of the autonomous systems. The database system must therefore provide integrated access to the underlying autonomous, heterogeneous information bases, allow the integration of new data types (e.g. sequence and spatial data), and support associative retrieval of the data. Tools such as receptor-ligand docking algorithms, model building tools for receptors, and visualization tools, which are developed or provided by the other partners within the project (e.g. Whatif, LUDI [2]), must also be connected to the DBMS.

[1] Hans-Joachim Böhm, et al. The computer program LUDI: A new method for the de novo design of enzyme inhibitors, 1992, J. Comput. Aided Mol. Des.

[2] G. J. Williams, et al. The Protein Data Bank: a computer-based archival file for macromolecular structures, 1978, Archives of Biochemistry and Biophysics.

[3] Erich J. Neuhold, et al. Object-Oriented Modeling for Hypermedia Systems Using the VODAK Model Language, 1993, NATO ASI OODBS.

[4] G. J. Kemp, et al. An object-oriented database for protein structure analysis, 1990, Protein Engineering.

[5] Erich J. Neuhold, et al. Database integration using the open object-oriented database system VODAK, 1995.

[6] Ali R. Hurson, et al. A taxonomy and current issues in multidatabase systems, 1992, Computer.

[7] Erich J. Neuhold, et al. Semantic vs. structural resemblance of classes, 1991, SGMD.

[8] G. J. Williams, et al. The Protein Data Bank: a computer-based archival file for macromolecular structures, 1977, Journal of Molecular Biology.

[9] Michael Schrefl, et al. Object class definition by generalization using upward inheritance, 1988, Proceedings. Fourth International Conference on Data Engineering.

[10] M. J. Sternberg, et al. A relational database of protein structures designed for flexible enquiries about conformation, 1989, Protein Engineering.

[11] Erich J. Neuhold, et al. ViewSystem: integrating heterogeneous information bases by object-oriented views, 1990, Proceedings. Sixth International Conference on Data Engineering.

[12] S. J. Wodak, et al. Sesam: A relational database for structure and sequence of macromolecules, 1991, Proteins.

[13] Michael Schrefl, et al. Dynamic Derivation of Personalized Views, 1988, VLDB.

[14] G. Vriend, et al. A novel search method for protein sequence-structure relations using property profiles, 1994, Protein Engineering.

[15] Karl Aberer, et al. Integrating relational and object-oriented database systems using a metaclass concept, 1994, J. Syst. Integr.

[16] Nick Roussopoulos, et al. Interoperability of multiple autonomous databases, 1990, CSUR.

[17] Gerhard Weikum, et al. Semantic concurrency control in object-oriented database systems, 1993, Proceedings of IEEE 9th International Conference on Data Engineering.