Community benchmarks for virtual screening

Ligand enrichment among top-ranking hits is a key metric of virtual screening. To avoid bias, decoys should resemble ligands physically, so that enrichment is not attributable to simple differences in gross features. We therefore created a directory of useful decoys (DUD) to benchmark docking performance, selecting decoys that resemble annotated ligands physically but not topologically. DUD has 2,950 annotated ligands and 95,316 property-matched decoys for 40 targets. To my knowledge, it is by far the largest and most comprehensive public data set for benchmarking virtual screening programs. This paper outlines several ways that DUD can be improved to provide better telemetry to investigators seeking to understand both the strengths and the weaknesses of current docking methods. I also highlight several pitfalls for the unwary: a risk of over-optimization, questions about chemical space, and the proper scope for using DUD. Careful attention to both the composition of benchmarks and how they are used is essential to avoid being misled by overfitting and bias.
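As an illustration of the enrichment metric mentioned above, the following minimal Python sketch computes an enrichment factor for the top fraction of a ranked list. It is not taken from the paper: the function name `enrichment_factor`, the `top_fraction` parameter, and the convention that lower scores rank better are all assumptions made for this example, and real docking programs may use a different score direction or a different enrichment measure.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_ligand: Sequence[bool],
    top_fraction: float = 0.01,
) -> float:
    """Enrichment factor at a given fraction of the ranked database.

    EF = (ligands in top slice / size of top slice)
         / (all ligands / all compounds)

    Assumption: lower (more negative) scores are better.
    """
    if len(scores) != len(is_ligand):
        raise ValueError("scores and is_ligand must have the same length")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")

    n_total = len(scores)
    n_ligands = sum(1 for flag in is_ligand if flag)
    if n_total == 0 or n_ligands == 0:
        raise ValueError("need at least one compound and one ligand")

    # Size of the top slice; always inspect at least one compound.
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, flag in ranked[:n_top] if flag)

    return (hits_in_top / n_top) / (n_ligands / n_total)


if __name__ == "__main__":
    # Toy data: 2 ligands and 8 decoys, with both ligands ranked first.
    toy_scores = [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
    toy_labels = [True, True] + [False] * 8
    print(enrichment_factor(toy_scores, toy_labels, top_fraction=0.2))
```

A value near 1 means the ranking is no better than random selection. Because the factor depends only on rank order, decoys that differ from ligands in gross physical properties can inflate it without reflecting genuine recognition of binding, which is the bias that property-matched decoys are meant to reduce.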
