Heterogeneous Database Federation Using Grid Technology for Drug Discovery Process

Rapid progress in biotechnology has produced a growing number of life science databases, each operated and managed independently on the Internet. Under these circumstances, an infrastructure is needed that allows the information in these databases to be shared and that supports collaborative research. Grid technology is an emerging technology for the seamless, loosely coupled integration of diverse resources distributed across the Internet. To federate these heterogeneous databases, we have developed a system that supports the drug discovery process using Globus Toolkit 3/OGSA-DAI. As an essential part of the system, we introduce a protein-compound interaction search based on metadata that bridges protein and compound information through their interaction types, such as inhibitor, agonist, and antagonist. We demonstrate the effectiveness of the system by searching for candidate compounds that interact with the glucocorticoid receptor protein.
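The idea of bridging metadata can be illustrated with a minimal in-memory sketch. It is not the system's implementation and does not use Globus Toolkit or OGSA-DAI: two separate stores (proteins and compounds) are linked only through a third table of interaction records. All identifiers, record contents, and the `find_compounds` helper are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Interaction:
    """One bridging record: links a protein to a compound with a type."""
    protein_id: str
    compound_id: str
    interaction_type: str  # e.g. "inhibitor", "agonist", "antagonist"


# Hypothetical stand-ins for two independently managed databases.
PROTEINS = {"P1": "glucocorticoid receptor", "P2": "estrogen receptor"}
COMPOUNDS = {"C1": "compound A", "C2": "compound B", "C3": "compound C"}

# Hypothetical bridging metadata; neither store above references the other.
INTERACTIONS = [
    Interaction("P1", "C1", "agonist"),
    Interaction("P1", "C2", "antagonist"),
    Interaction("P2", "C3", "inhibitor"),
]


def find_compounds(
    protein_name: str, interaction_type: Optional[str] = None
) -> List[Tuple[str, str]]:
    """Return (compound name, interaction type) pairs for a protein name.

    The lookup goes protein store -> bridging metadata -> compound store,
    optionally filtered by interaction type.
    """
    protein_ids = {pid for pid, name in PROTEINS.items() if name == protein_name}
    hits = []
    for link in INTERACTIONS:
        if link.protein_id not in protein_ids:
            continue
        if interaction_type is not None and link.interaction_type != interaction_type:
            continue
        hits.append((COMPOUNDS[link.compound_id], link.interaction_type))
    return hits
```

Calling `find_compounds("glucocorticoid receptor", "antagonist")` on this toy data follows the bridge from the protein store to the compound store and keeps only the antagonist records. In a federated setting each store would be a remote database, but the join through the interaction records would have the same shape.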
