Knowledge-based scoring function to predict protein-ligand interactions.

The development and validation of a new knowledge-based scoring function (DrugScore) to describe the binding geometry of ligands in proteins is presented. It discriminates efficiently between well-docked ligand binding modes (root-mean-square deviation <2.0 Å with respect to a crystallographically determined reference complex) and those deviating largely from the native structure, e.g. those generated by computer docking programs. Structural information is extracted from crystallographically determined protein-ligand complexes using ReLiBase and converted into distance-dependent pair preferences and solvent-accessible surface (SAS) dependent singlet preferences for protein and ligand atoms. Defining an appropriate reference state and accounting for the inaccuracies inherent in experimental data are required to achieve good predictive power. The sum of the pair preferences and the singlet preferences is calculated from the 3D structure of protein-ligand binding modes generated by docking tools. For two test sets of 91 and 68 protein-ligand complexes taken from the Protein Data Bank (PDB), the calculated score ranks a FlexX-generated pose deviating <2.0 Å from the crystal structure first in three quarters of all possible cases. This is a substantial improvement over FlexX. For ligand geometries generated by DOCK, DrugScore is superior to the "chemical scoring" implemented in that tool, while the "energy scoring" in DOCK gives comparable results. None of the presently known scoring functions achieves comparable power to extract binding modes in agreement with experiment. DrugScore is fast to compute, implicitly accounts for solvation and entropy contributions, and correctly reproduces the geometry of directional interactions. Small deviations in the 3D structure are tolerated and, since only contacts between non-hydrogen atoms are considered, the function is independent of assumptions about protonation states.
