Docking Ligands into Flexible and Solvated Macromolecules. 5. Force-Field-Based Prediction of Binding Affinities of Ligands to Proteins

We report herein the development of three empirical scoring functions for protein-ligand docking. The first scoring function was developed from 209 crystal structures of protein-ligand complexes and the second from 946 cross-docked complexes. For both, the coefficients of the individual terms were tuned iteratively to optimize the correlation between observed activities and calculated scores. The third scoring function was developed from libraries of known actives and decoys docked to six different protein conformational ensembles. In this case, the coefficients were tuned to optimize the area under the receiver operating characteristic (ROC) curve for the discrimination of actives from inactives. The newly developed scoring functions were then assessed on independent sets of protein-ligand complexes for their ability to predict binding affinities and to discriminate actives from inactives. In the first validation, the first function, which was trained on active compounds only, performed as well as other commonly used scoring functions. In a high-throughput virtual screening validation on five protein conformational ensembles, the third scoring function, which included data from inactive compounds, performed significantly better. This validation showed that including data from inactive compounds is critical for performance in virtual high-throughput screening applications.
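To make the ROC-based objective concrete, the sketch below shows one way the area under the ROC curve could be computed for a set of docked actives and decoys. It is an illustration only, not the implementation used in this work: the function names `linear_score` and `roc_auc`, the weighted-sum functional form, and the convention that a lower (more negative) score ranks a compound as more likely active are all assumptions introduced here.

```python
from typing import Sequence


def linear_score(weights: Sequence[float], terms: Sequence[float]) -> float:
    """Weighted sum of per-pose energy terms (hypothetical functional form)."""
    if len(weights) != len(terms):
        raise ValueError("weights and terms must have the same length")
    return sum(w * t for w, t in zip(weights, terms))


def roc_auc(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Assumption: a lower score means a better predicted binder, so an
    active "wins" a pair when its score is below the decoy's score.
    Ties count as half a win.
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Made-up scores for illustration only; these are not data from the study.
example_actives = [-9.0, -7.5, -6.0]
example_decoys = [-8.0, -5.0, -4.0]
example_auc = roc_auc(example_actives, example_decoys)
```

Under these assumptions, a coefficient-tuning loop would recompute `linear_score` for every docked compound with a trial set of weights and keep the weights that raise `roc_auc`. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys.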