Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking

Ligand enrichment against challenging decoys remains a key metric for assessing molecular docking. Although the Directory of Useful Decoys (DUD) has been widely used, clear areas for optimization have emerged. Here we describe an improved benchmarking set that includes more diverse targets such as GPCRs and ion channels, totaling 102 proteins with 22,886 clustered ligands drawn from ChEMBL, each with 50 property-matched decoys drawn from ZINC. To ensure chemotype diversity, we cluster each target’s ligands by their Bemis–Murcko atomic frameworks. We add net charge to the matched physicochemical properties and, among the property-matched candidates, keep as decoys only those most topologically dissimilar to the ligands. An online automated tool (http://decoys.docking.org) generates these improved matched decoys for user-supplied ligands. We test this data set by docking all 102 targets, using the results to improve the balance between ligand desolvation and electrostatics in DOCK 3.6. The complete DUD-E benchmarking set is freely available at http://dude.docking.org.
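
The decoy selection idea described above (match physicochemical properties including net charge, then keep the candidates least similar to the ligands by topology) can be illustrated with a minimal Python sketch. It is not the DUD-E implementation: the `Molecule` record, the `select_decoys` function, the choice of molecular weight and logP as matched properties, and the tolerance values are all assumptions made for illustration, and fingerprints are represented simply as sets of on-bit indices from any topological fingerprint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical record; the fields are illustrative, not the DUD-E schema."""
    name: str
    mol_weight: float
    logp: float
    net_charge: int
    fingerprint: frozenset  # on-bit indices of some topological fingerprint


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_decoys(ligand, all_ligands, pool, n_decoys=50,
                  mw_tol=25.0, logp_tol=1.0):
    """Return up to n_decoys property-matched, topologically dissimilar molecules.

    The tolerances are assumed values chosen only for this sketch.
    """
    # Step 1: keep candidates whose properties match the ligand,
    # including an exact match on net charge.
    matched = [
        m for m in pool
        if m.net_charge == ligand.net_charge
        and abs(m.mol_weight - ligand.mol_weight) <= mw_tol
        and abs(m.logp - ligand.logp) <= logp_tol
    ]

    # Step 2: score each candidate by its highest similarity to any
    # known ligand of the target, then keep the least similar ones.
    def max_similarity(m):
        return max(tanimoto(m.fingerprint, lig.fingerprint)
                   for lig in all_ligands)

    return sorted(matched, key=max_similarity)[:n_decoys]
```

Ranking candidates by their maximum similarity to any ligand of the target, rather than only to the query ligand, is one way to reduce the chance that a decoy resembles a different known binder; whether the original tool ranks this way is not stated in the abstract.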
