Can we trust docking results? Evaluation of seven commonly used programs on PDBbind database

Docking is one of the most commonly used techniques in drug design. It is used both to identify the correct pose of a ligand in the binding site of a protein and to estimate the strength of the protein–ligand interaction. Because millions of compounds must be screened before a suitable target for biological testing can be identified, all calculations must be completed in a reasonable time frame. Thus, all programs currently in use rely on empirically based algorithms and avoid a systematic search of the conformational space. Similarly, scoring is done with simple equations, which speeds up the entire process. As a consequence of these approximations, docking results have to be verified by subsequent in vitro studies. The purpose of our work was to evaluate seven popular docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) on an extensive dataset of 1300 protein–ligand complexes from the PDBbind 2007 database, for which experimentally measured binding affinities were also available. We assessed two abilities independently: posing [by the root mean square deviation (also called root mean square distance) between each predicted conformation and the corresponding native one] and scoring (by the correlation between docking score and ligand binding strength). To our knowledge, this is the first large‐scale docking evaluation that covers both aspects of docking programs, that is, predicting the ligand conformation and calculating the strength of its binding. The more than 1000 protein–ligand pairs cover a wide range of protein families and inhibitor classes. Our results clearly showed that the ligand binding conformation could be identified in most cases with the existing software, yet we still observed the lack of a universal scoring function for all types of molecules and protein families. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2011
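The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions rather than facts from the abstract: the function names (`rmsd`, `success_rate`, `pearson`) are hypothetical, the poses are assumed to share atom ordering and to sit in the same receptor frame (so no superposition is applied), the 2.0 Å success cutoff is only a commonly used convention, and the Pearson coefficient is used because the abstract does not say which correlation measure was applied.

```python
import math


def rmsd(pred, ref):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples.

    Assumes identical atom ordering and a shared coordinate frame,
    so the coordinates are compared directly without superposition.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(pred, ref)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(pred))


def success_rate(rmsd_values, cutoff=2.0):
    """Fraction of complexes whose pose RMSD is at or below the cutoff.

    The 2.0 default is an assumed convention, not a value from the abstract.
    """
    return sum(1 for v in rmsd_values if v <= cutoff) / len(rmsd_values)


def pearson(xs, ys):
    """Pearson correlation between docking scores and measured affinities."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data for illustration only; these are not values from the study.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
pose_error = rmsd(docked, native)
```

Keeping the posing and scoring measures as separate functions mirrors the independent assessment described above: a program can reproduce the native pose well and still rank ligands poorly by affinity.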
