Docking Validation Resources: Protein Family and Ligand Flexibility Experiments

A database of 780 ligand-receptor complexes, termed SB2010, has been derived from the Protein Data Bank to evaluate the accuracy of docking protocols for regenerating bound ligand conformations. The goal is to provide easily accessible community resources for development of improved procedures to aid virtual screening for ligands with a wide range of flexibilities. Three core experiments using the program DOCK, which employ rigid (RGD), fixed anchor (FAD), and flexible (FLX) protocols, were used to gauge performance by several different metrics: (1) global results, (2) ligand flexibility, (3) protein family, and (4) cross-docking. Global spectrum plots of successes and failures vs. RMSD reveal well-defined inflection regions, which suggest that the commonly used 2 Å criterion is a reasonable choice for defining success. Across all 780 systems, success tracks with the relative difficulty of the calculations: RGD (82.3%) > FAD (78.1%) > FLX (63.8%). In general, failures due to scoring strongly outweigh those due to sampling. Subsets of SB2010 grouped by ligand flexibility (7-or-less, 8-to-15, and 15-plus rotatable bonds) reveal that success degrades linearly for the FAD and FLX protocols, in contrast to RGD, for which it remains constant. Despite the challenges associated with FLX anchor orientation and on-the-fly flexible growth, success rates for the 7-or-less (74.5%) and, in particular, the 8-to-15 (55.2%) subsets are encouraging. Poorer results for the very flexible 15-plus set (39.3%) indicate substantial room for improvement. Family-based success appears largely independent of ligand flexibility, suggesting a strong dependence on the binding site environment. For example, zinc-containing proteins are generally problematic, even though their ligands are only moderately flexible.
Finally, representative cross-docking examples for the carbonic anhydrase, thermolysin, and neuraminidase families show the utility of family-based analysis for rapid identification of particularly good or bad docking trends and of the types of failure involved (scoring or sampling), which will likely be of interest to researchers making specific receptor choices for virtual screening. SB2010 is available for download at http://rizzolab.org.
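The success and failure bookkeeping described above can be illustrated with a short Python sketch. It is not taken from DOCK or from the SB2010 distribution, and it rests on assumptions the abstract does not spell out: each docked pose carries a score (lower taken as better) and an RMSD to the crystallographic ligand; a system counts as a success when its best-scored pose lies within the 2 Å cutoff, as a scoring failure when some sampled pose lies within the cutoff but is not the best-scored one, and as a sampling failure when no pose lies within the cutoff. The names `Pose`, `classify`, and `summarize` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

RMSD_CUTOFF = 2.0  # Å; the commonly used success criterion named in the abstract

OUTCOMES = ("success", "scoring_failure", "sampling_failure")


@dataclass
class Pose:
    """One docked pose (hypothetical container, not a DOCK data structure)."""
    score: float  # assumption: lower (more negative) score is better
    rmsd: float   # RMSD to the crystallographic ligand pose, in Å


def classify(poses: List[Pose], cutoff: float = RMSD_CUTOFF) -> str:
    """Label one ligand-receptor system under the assumed definitions."""
    if not poses:
        # Nothing was generated, so nothing near the crystal pose was sampled.
        return "sampling_failure"
    best = min(poses, key=lambda p: p.score)
    if best.rmsd <= cutoff:
        return "success"
    if any(p.rmsd <= cutoff for p in poses):
        # A near-native pose exists but was out-scored by a wrong one.
        return "scoring_failure"
    return "sampling_failure"


def summarize(systems: Dict[str, List[Pose]]) -> Dict[str, float]:
    """Return the percentage of systems falling into each outcome."""
    total = len(systems)
    if total == 0:
        return {name: 0.0 for name in OUTCOMES}
    counts = Counter(classify(poses) for poses in systems.values())
    return {name: 100.0 * counts[name] / total for name in OUTCOMES}


if __name__ == "__main__":
    # Toy data with made-up labels and values, for illustration only.
    toy = {
        "system_a": [Pose(-52.1, 0.8), Pose(-47.3, 6.2)],
        "system_b": [Pose(-60.4, 5.5), Pose(-58.0, 1.4)],
        "system_c": [Pose(-33.9, 7.1), Pose(-31.2, 9.8)],
        "system_d": [Pose(-45.0, 1.9)],
    }
    for name, pct in summarize(toy).items():
        print(f"{name}: {pct:.1f}%")
```

Applying the same function to subsets of systems, for example those grouped by rotatable-bond count or by protein family, would give per-subset rates analogous to those quoted above. The subset boundaries would have to be taken from the SB2010 files themselves rather than guessed.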
