Knowledge-guided docking: accurate prospective prediction of bound configurations of novel ligands using Surflex-Dock

Abstract: Prediction of the bound configuration of small-molecule ligands that differ substantially from the cognate ligand of a protein co-crystal structure is much more challenging than re-docking the cognate ligand. Success rates for cross-docking in the range of 20–30% are common. We present an approach that uses structural information known prior to a particular cutoff date to make predictions on ligands whose bound structures were determined later. The knowledge-guided docking protocol was tested on a set of ten protein targets using a total of 949 ligands. The benchmark data set, called PINC (“PINC Is Not Cognate”), is publicly available. Protein pocket similarity was used to choose representative structures for ensemble docking. The docking protocol made use of ligand poses known prior to the cutoff date, both to help guide the configurational search and to adjust the rank of predicted poses. Overall, the top-scoring pose family was correct over 60% of the time, with the top two pose families approaching a 75% success rate. A correct pose was present among all those predicted nearly 90% of the time. The largest improvements came from the use of molecular similarity to improve ligand pose rankings and from the strategy for identifying representative protein structures. With the exception of a single outlier target, the knowledge-guided docking protocol produced results matching the quality of cognate-ligand re-docking, but it did so on a very challenging, temporally segregated cross-docking benchmark.
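The temporal segregation and the top-k pose-family success rates described in the abstract can be illustrated with a minimal sketch. This is not the Surflex-Dock protocol or the PINC tooling. The `Ligand` record, its field names, the treatment of the cutoff date itself as belonging to the test set, and the 2.0 Å RMSD success threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Ligand:
    # Hypothetical record; field names are illustrative, not from the paper.
    name: str
    deposited: date                  # date the bound structure became known
    pose_family_rmsds: List[float]   # best RMSD (in Å) per pose family, in rank order


def temporal_split(ligands: List[Ligand], cutoff: date) -> Tuple[List[Ligand], List[Ligand]]:
    """Split ligands into those known before the cutoff and those determined later.

    Assumption: a structure dated exactly on the cutoff counts as "later".
    """
    known = [lig for lig in ligands if lig.deposited < cutoff]
    test = [lig for lig in ligands if lig.deposited >= cutoff]
    return known, test


def top_k_success(test: List[Ligand], k: Optional[int], threshold: float = 2.0) -> float:
    """Fraction of test ligands with a correct pose among the top-k pose families.

    A pose family counts as correct when its RMSD is at or below `threshold`
    (2.0 Å is an assumed convention, not a value taken from the abstract).
    Passing k=None considers every predicted pose family.
    """
    if not test:
        return 0.0
    hits = sum(
        1
        for lig in test
        if any(rmsd <= threshold for rmsd in lig.pose_family_rmsds[:k])
    )
    return hits / len(test)
```

Under these assumptions, `top_k_success(test, 1)` corresponds to the top-scoring pose family rate, `top_k_success(test, 2)` to the top-two rate, and `top_k_success(test, None)` to the rate at which a correct pose appears anywhere among the predictions.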
