Calculating the knowledge-based similarity of functional groups using crystallographic data

A knowledge-based method for calculating the similarity of functional groups is described and validated. The method is based on experimental information derived from small-molecule crystal structures. These data are used in the form of scatterplots that show the likelihood of a non-bonded interaction being formed between functional group A (the 'central group') and functional group B (the 'contact group' or 'probe'). The scatterplots are converted into three-dimensional maps that show the propensity of the probe at different positions around the central group. Here we describe how to calculate the similarity of a pair of central groups based on these maps. The similarity method is validated using bioisosteric functional group pairs identified in the Bioster database and Relibase. The Bioster database is a critical compilation of thousands of bioisosteric molecule pairs, including drugs, enzyme inhibitors and agrochemicals. Relibase is an object-oriented database containing structural data about protein-ligand interactions. The distribution of similarities for the bioisosteric functional group pairs is compared with the distribution for all possible pairs in IsoStar, and the two are found to be significantly different. Enrichment factors are also calculated, showing that the similarity method is significantly better than random selection at predicting bioisosteric functional group pairs.
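As a rough illustration of the two calculations mentioned above, the sketch below compares two propensity maps and computes an enrichment factor. The abstract does not state which similarity index is used, so a Carbo-style (cosine) index is assumed here purely for illustration; the function names, the flattened-list map representation, and the `top_fraction` parameter are likewise hypothetical and not taken from the paper.

```python
import math


def carbo_similarity(map_a, map_b):
    """Carbo-style (cosine) index between two flattened propensity maps.

    Assumption: both maps are sampled on the same grid around their
    central groups and stored as flat sequences of propensity values.
    """
    if len(map_a) != len(map_b):
        raise ValueError("maps must be sampled on the same grid")
    overlap = sum(a * b for a, b in zip(map_a, map_b))
    norm_a = math.sqrt(sum(a * a for a in map_a))
    norm_b = math.sqrt(sum(b * b for b in map_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty map carries no information to compare
    return overlap / (norm_a * norm_b)


def enrichment_factor(scores, is_bioisostere, top_fraction):
    """Hit rate in the top-ranked fraction divided by the overall hit rate.

    scores: similarity value for each functional group pair.
    is_bioisostere: booleans marking pairs known to be bioisosteric.
    top_fraction: share of the ranked list to inspect (hypothetical knob).
    """
    n = len(scores)
    n_top = max(1, int(round(n * top_fraction)))
    ranked = sorted(zip(scores, is_bioisostere),
                    key=lambda pair: pair[0], reverse=True)
    hits_top = sum(1 for _, flag in ranked[:n_top] if flag)
    hits_all = sum(1 for flag in is_bioisostere if flag)
    if hits_all == 0:
        return 0.0
    return (hits_top / n_top) / (hits_all / n)
```

An enrichment factor above 1 means that known bioisosteric pairs are concentrated among the highest-scoring pairs more than random selection would give.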
