DEKOIS: Demanding Evaluation Kits for Objective in Silico Screening - A Versatile Tool for Benchmarking Docking Programs and Scoring Functions

For widely applied in silico screening techniques, success depends on the rational selection of an appropriate method. We herein present a fast, versatile, and robust method to construct demanding evaluation kits for objective in silico screening (DEKOIS). This automated process enables the creation of tailor-made decoy sets for any given set of bioactives. It facilitates target-dependent validation of docking algorithms and scoring functions, helping to save time and resources. We have developed metrics for assessing and improving decoy set quality and employ them to investigate how decoy embedding affects docking. We demonstrate that screening performance is target-dependent and can be impaired by latent actives in the decoy set (LADS) or enhanced by poor decoy embedding. The presented method allows the collection of publicly available high-quality decoy sets to be extended and complemented toward new target space. All present and future DEKOIS data sets will be made accessible at www.dekois.com.
