Scoring confidence index: statistical evaluation of ligand binding mode predictions

Protein-ligand docking programs can generate a large number of possible binding orientations for each ligand candidate. The challenge is to identify the orientations closest to the native binding mode using a scoring method. Many scoring functions have been developed for protein-ligand scoring, but their performance on binding mode prediction is often target-dependent. In this study, a statistical approach was employed to provide a confidence measure of how well a scoring function identifies docked ligand orientations close to the correct one. It exploits the fact that the scores provided by an adequately performing scoring function generally improve as the ligand binding modes get closer to the correct native orientation. In such cases, the correlation coefficient of scores versus distances from a reference orientation is expected to be highest when the most native-like orientation is used as the reference. This correlation coefficient, called the correlation-based score (CBScore), was used as an indicator of how far a docked pose was from the native orientation. The correlation between the original scores and the CBScores, as well as the range of the CBScores, were found to be good measures of scoring performance. They were combined into a single quantity, called the scoring confidence index. High values of the scoring confidence index indicated pronounced and relatively smooth binding energy landscapes with easily discernible global minima, resulting in reliable binding mode predictions. Low values of the index reflected rugged energy landscapes, which make prediction of the correct binding mode very difficult and often unreliable. The diagnostic ability of the scoring confidence index was tested on a non-redundant set of 50 protein-ligand complexes scored with three commonly employed scoring functions: AffiScore, DrugScore and X-Score. Binding mode predictions were found to be three times more reliable for complexes with scoring confidence indices in the upper half of the resulting range of 0–1.6 than for complexes with values in the lower half. This new confidence measure of scoring performance is expected to be a valuable tool for virtual screening applications.
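
The CBScore idea described above can be illustrated with a minimal sketch. It assumes that each docked pose has a score for which lower values are better, and that a symmetric matrix of pairwise pose distances (for example, RMSD values) is available. The function names `cbscores` and `confidence_components`, the inclusion of the reference pose in its own correlation, and the lower-is-better sign convention are assumptions made for illustration, not details taken from the study. The abstract does not state how the two components are combined into the scoring confidence index, so the sketch reports them separately.

```python
import numpy as np


def cbscores(scores, dist):
    """Correlation-based score for each pose (illustrative sketch).

    scores : length-n sequence, one score per docked pose
             (assumed here: lower score = better pose).
    dist   : n x n symmetric matrix of pairwise pose distances
             (assumed here: e.g. RMSD between poses).

    For every pose taken in turn as the reference, compute the Pearson
    correlation between all scores and the distances of all poses to
    that reference.
    """
    scores = np.asarray(scores, dtype=float)
    dist = np.asarray(dist, dtype=float)
    n = scores.shape[0]
    cb = np.empty(n)
    for ref in range(n):
        # Row `ref` holds the distance of every pose to the reference pose.
        cb[ref] = np.corrcoef(scores, dist[ref])[0, 1]
    return cb


def confidence_components(scores, cb):
    """Return the two quantities the abstract names as performance measures:
    the correlation between original scores and CBScores, and the range of
    CBScores. Their combination into a single index is not given in the
    abstract and is therefore not attempted here."""
    scores = np.asarray(scores, dtype=float)
    r = np.corrcoef(scores, cb)[0, 1]
    cb_range = cb.max() - cb.min()
    return r, cb_range


# Toy example: five poses on a line, with the native-like pose at position 0
# and scores that worsen linearly with distance from it.
positions = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
toy_scores = positions.copy()
toy_dist = np.abs(positions[:, None] - positions[None, :])

toy_cb = cbscores(toy_scores, toy_dist)
best_reference = int(np.argmax(toy_cb))
r, cb_range = confidence_components(toy_scores, toy_cb)
```

In this toy funnel the pose with the highest CBScore is the native-like one. With lower-is-better scores, the correlation between scores and CBScores comes out negative; under the opposite score convention the signs would flip.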
