4.12 – Docking and Scoring

This chapter gives an introduction to small-molecule receptor docking and illustrates basic principles, challenges, potential pitfalls, approaches to improving the results, and successful applications. Small-molecule docking, or structure-based virtual screening (VS), essentially consists of two steps: (1) a small molecule is placed in the receptor by exploring its external and internal degrees of freedom; (2) once one or more reasonable orientations have been found, the affinity of this molecule for the receptor is predicted, with the ultimate aim of comparing it to other molecules and differentiating good binders from bad binders. Some general considerations and the motivations for performing a VS campaign are outlined in Section 4.12.1 of this chapter. In Section 4.12.2 the innards of three docking programs are described in detail in order to give an idea of how docking works and which algorithmic approaches can be taken. The algorithms selected are representative of the variety of approaches taken in this field, and cover methods such as Monte Carlo sampling, genetic algorithms, whole-ligand docking, and incremental buildup. There are many approaches to predicting the affinity of a molecule for its receptor; this process is generally referred to as scoring. In Section 4.12.3 scoring methods are analyzed in detail and various ways to improve scoring are described: tuning the scoring function to a certain target, designing a function that is firmly rooted in statistical mechanics, or combining several scoring functions. Although docking programs and scoring functions have been validated for general use, before embarking on a docking campaign against a specific target with a specific set of molecules, one should first assess which program and which scoring function(s) work best. Straightforward as this may seem, assessment of the quality of docking tools and scoring functions is fraught with unresolved issues.
Section 4.12.4 contains considerations regarding evaluation and assessment and highlights some specific experiences of the authors. Section 4.12.5 describes the main challenges in structure-based VS. One key issue is that most docking programs do not adequately account for the fact that protein targets are flexible and, under physiological conditions, are immersed in water. Approaches to overcome this issue are described. Other improvements relate to accounting for nonspecific binders, increasing the efficiency of docking programs by introducing bias, and improving the quality of results by postprocessing. Several application examples are listed in the penultimate section (Section 4.12.6). Here the emphasis is on illustrating how structure-based VS is applied in modern drug discovery, often in combination with other in silico and experimental techniques. These examples range from high-throughput docking of a large number of compounds to applications where docking is one step in a cascade of in silico methods used to winnow down a large set of molecules to a small set of potent binders. Finally, we provide a glimpse of the future as we see it.
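
One of the ideas mentioned above, combining several scoring functions, can be illustrated with a minimal sketch. The Python snippet below assumes a simple rank-averaging scheme, which is only one of several possible ways to form a consensus and is not necessarily the scheme discussed in Section 4.12.3. The molecule names, score values, and function names are all hypothetical, and lower scores are assumed to mean stronger predicted binding.

```python
from typing import Dict, List


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each molecule to its rank (1 = best), assuming lower score = better."""
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def consensus_ranking(score_tables: List[Dict[str, float]]) -> List[str]:
    """Order molecules by their mean rank across all scoring functions."""
    per_function_ranks = [rank_positions(table) for table in score_tables]
    molecules = sorted(score_tables[0])
    mean_rank = {
        molecule: sum(ranks[molecule] for ranks in per_function_ranks)
        / len(per_function_ranks)
        for molecule in molecules
    }
    return sorted(molecules, key=lambda molecule: (mean_rank[molecule], molecule))


# Hypothetical scores from three scoring functions on different scales.
function_a = {"mol1": -9.1, "mol2": -7.4, "mol3": -8.0}
function_b = {"mol1": -35.0, "mol2": -41.0, "mol3": -38.0}
function_c = {"mol1": -6.2, "mol2": -5.9, "mol3": -6.0}

ranking = consensus_ranking([function_a, function_b, function_c])
print(ranking)
```

Working with ranks rather than raw scores sidesteps the fact that different scoring functions report values on different scales, at the cost of discarding the size of the score differences.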

[56]  T Lengauer,et al.  The particle concept: placing discrete water molecules during protein‐ligand docking predictions , 1999, Proteins.

[57]  R Abagyan,et al.  Flexible protein–ligand docking by global energy optimization in internal coordinates , 1997, Proteins.

[58]  D S Lawrence,et al.  Identification of a second aryl phosphate-binding site in protein-tyrosine phosphatase 1B: a paradigm for inhibitor design. , 1997, Proceedings of the National Academy of Sciences of the United States of America.

[59]  K. Kinzler,et al.  The beta-catenin binding domain of adenomatous polyposis coli is sufficient for tumor suppression. , 2000, Cancer research.

[60]  D. Goodsell,et al.  Automated docking to multiple target structures: Incorporation of protein mobility and structural water heterogeneity in AutoDock , 2002, Proteins.

[61]  Robin Taylor,et al.  Comparing protein–ligand docking programs is difficult , 2005, Proteins.

[62]  Cornel Catana,et al.  Inhibition of protein–protein interactions: The discovery of druglike β‐catenin inhibitors by combining virtual and biophysical screening , 2006, Proteins.

[63]  C. Venkatachalam,et al.  LigScore: a novel scoring function for predicting binding affinities. , 2005, Journal of molecular graphics & modelling.

[64]  B. Kennedy,et al.  Increased insulin sensitivity and obesity resistance in mice lacking the protein tyrosine phosphatase-1B gene. , 1999, Science.

[65]  Hege S. Beard,et al.  Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. , 2004, Journal of medicinal chemistry.

[66]  Luhua Lai,et al.  SCORE: A New Empirical Method for Estimating the Binding Affinity of a Protein-Ligand Complex , 1998 .

[67]  I. Kuntz,et al.  Hierarchical database screenings for HIV-1 reverse transcriptase using a pharmacophore model, rigid docking, solvation docking, and MM-PB/SA. , 2005, Journal of medicinal chemistry.

[68]  B. Kuhn,et al.  Validation and use of the MM-PBSA approach for drug discovery. , 2005, Journal of medicinal chemistry.

[69]  Thomas Lengauer,et al.  Evaluation of the FLEXX incremental construction algorithm for protein–ligand docking , 1999, Proteins.

[70]  D. Williams,et al.  An analysis of the origins of a cooperative binding energy of dimerization. , 1998, Science.

[71]  W F van Gunsteren,et al.  Decomposition of the free energy of a system in terms of specific interactions. Implications for theoretical and experimental studies. , 1994, Journal of molecular biology.

[72]  Thomas M Frimurer,et al.  Ligand-induced conformational changes: improved predictions of ligand binding conformations and affinities. , 2003, Biophysical journal.

[73]  R. Woods,et al.  Involvement of water in carbohydrate-protein binding. , 2001, Journal of the American Chemical Society.

[74]  Gerhard Klebe,et al.  AFMoC enhances predictivity of 3D QSAR: a case study with DOXP-reductoisomerase. , 2005, Journal of medicinal chemistry.

[75]  E. Shakhnovich,et al.  SMoG: de Novo Design Method Based on Simple, Fast, and Accurate Free Energy Estimates. 1. Methodology and Supporting Evidence , 1996 .

[76]  Brian K Shoichet,et al.  Testing a flexible-receptor docking algorithm in a model binding site. , 2004, Journal of molecular biology.

[77]  Hugo Kubinyi,et al.  Success Stories of Computer‐Aided Design , 2006 .

[78]  Todd J. A. Ewing,et al.  Critical evaluation of search algorithms for automated molecular docking and database screening , 1997 .

[79]  Y. Martin,et al.  A general and fast scoring function for protein-ligand interactions: a simplified potential approach. , 1999, Journal of medicinal chemistry.

[80]  B. Shoichet,et al.  Flexible ligand docking using conformational ensembles , 1998, Protein science : a publication of the Protein Society.

[81]  J. Kuriyan,et al.  The Conformational Plasticity of Protein Kinases , 2002, Cell.

[82]  T. Clackson,et al.  A hot spot of binding energy in a hormone-receptor interface , 1995, Science.

[83]  M Rarey,et al.  Detailed analysis of scoring functions for virtual screening. , 2001, Journal of medicinal chemistry.

[84]  Anna Vulpetti,et al.  Novel Scoring Functions Comprising QXP, SASA, and Protein Side-Chain Entropy Terms , 2004, J. Chem. Inf. Model..

[85]  Hans-Joachim Böhm,et al.  LUDI: rule-based automatic design of new substituents for enzyme inhibitor leads , 1992, J. Comput. Aided Mol. Des..

[86]  P. Goodford A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. , 1985, Journal of medicinal chemistry.

[87]  I. Kuntz,et al.  Ligand solvation in molecular docking , 1999, Proteins.

[88]  New optimization method for conformational energy calculations on polypeptides: Conformational space annealing , 1997 .

[89]  Richard A. Lewis,et al.  Lessons in molecular recognition: the effects of ligand and protein flexibility on molecular docking accuracy. , 2004, Journal of medicinal chemistry.

[90]  B. Shoichet,et al.  Molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1B. , 2002, Journal of medicinal chemistry.

[91]  F. Jørgensen,et al.  A new concept for multidimensional selection of ligand conformations (MultiSelect) and multidimensional scoring (MultiScore) of protein-ligand binding affinities. , 2001, Journal of medicinal chemistry.

[92]  Marta Murcia,et al.  Virtual screening with flexible docking and COMBINE-based models. Application to a series of factor Xa inhibitors. , 2004, Journal of medicinal chemistry.

[93]  Meir Glick,et al.  Application of Machine Learning To Improve the Results of High-Throughput Docking Against the HIV-1 Protease , 2004, J. Chem. Inf. Model..

[94]  Patrizia Crivori,et al.  Virtual screening to enrich a compound collection with CDK2 inhibitors using docking, scoring, and composite scoring models , 2005, Proteins.

[95]  R. Glen,et al.  Molecular recognition of receptor sites using a genetic algorithm with a description of desolvation. , 1995, Journal of molecular biology.

[96]  Garland R. Marshall,et al.  VALIDATE: A New Method for the Receptor-Based Prediction of Binding Affinities of Novel Ligands , 1996 .

[97]  Hans-Joachim Böhm,et al.  The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure , 1994, J. Comput. Aided Mol. Des..

[98]  R. Wade,et al.  Prediction of drug binding affinities by comparative binding energy analysis , 1995 .

[99]  G. Klebe,et al.  DrugScore meets CoMFA: adaptation of fields for molecular comparison (AFMoC) or how to tailor knowledge-based pair-potentials to a particular protein. , 2002, Journal of medicinal chemistry.

[100]  Colin McMartin,et al.  QXP: Powerful, rapid computer algorithms for structure-based drug design , 1997, J. Comput. Aided Mol. Des..

[101]  J. Wendoloski,et al.  Identification of compounds with nanomolar binding affinity for checkpoint kinase-1 using knowledge-based virtual screening. , 2004, Journal of medicinal chemistry.

[102]  Claudio N. Cavasotto,et al.  Protein flexibility in ligand docking and virtual screening to protein kinases. , 2004, Journal of molecular biology.

[103]  P. Kollman,et al.  Continuum Solvent Studies of the Stability of DNA, RNA, and Phosphoramidate−DNA Helices , 1998 .

[104]  B. Shoichet,et al.  Information decay in molecular docking screens against holo, apo, and modeled conformations of enzymes. , 2003, Journal of medicinal chemistry.

[105]  M. Murcko,et al.  Consensus scoring: A method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. , 1999, Journal of medicinal chemistry.

[106]  Anna Vulpetti,et al.  Assessment of Docking Poses: Interactions-Based Accuracy Classification (IBAC) versus Crystal Structure Deviations , 2004, J. Chem. Inf. Model..

[107]  T. A. Graham,et al.  Crystal Structure of a β-Catenin/Tcf Complex , 2000, Cell.

[108]  A. N. Jain,et al.  Hammerhead: fast, fully automated docking of flexible ligands to protein binding sites. , 1996, Chemistry & biology.

[109]  G. Klebe,et al.  Knowledge-based scoring function to predict protein-ligand interactions. , 2000, Journal of molecular biology.

[110]  K. Sharp,et al.  Accurate Calculation of Hydration Free Energies Using Macroscopic Solvent Models , 1994 .

[111]  G. Fogliatto,et al.  WaterLOGSY as a method for primary NMR screening: Practical aspects and range of applicability , 2001, Journal of biomolecular NMR.

[112]  Hans-Joachim Böhm,et al.  Prediction of binding constants of protein ligands: A fast method for the prioritization of hits obtained from de novo design or 3D database search programs , 1998, J. Comput. Aided Mol. Des..

[113]  G. Klebe,et al.  Approaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptors. , 2002, Angewandte Chemie.

[114]  Paul Watson,et al.  Virtual Screening Using Protein-Ligand Docking: Avoiding Artificial Enrichment , 2004, J. Chem. Inf. Model..

[115]  G. Klebe The use of composite crystal-field environments in molecular recognition and the de novo design of protein ligands. , 1994, Journal of molecular biology.