Lead Finder docking and virtual screening evaluation with Astex and DUD test sets

Lead Finder is a molecular docking program. Its sampling uses an original implementation of the genetic algorithm that incorporates a number of additional optimization procedures. Lead Finder's scoring functions employ a set of semi-empirical molecular mechanics functionals that have been parameterized independently for docking, binding energy prediction, and rank-ordering in virtual screening. Sampling and scoring both use a staged approach, moving from fast but less accurate versions of the algorithms to computationally more intensive but more accurate ones. Lead Finder also includes tools for preparing full-atom protein and ligand models. In this exercise, Lead Finder achieved a 72.9% docking success rate on the Astex test set when the original author-prepared full-atom models were used, and a 74.1% success rate when the structures were prepared by Lead Finder. The major cause of docking failures was scoring errors resulting from the use of imperfect solvation models. In many cases, docking errors could be corrected by assigning proper protonation states and using correct ring conformations of the ligands. In virtual screening experiments on the DUD test set, an early enrichment factor of several tens was achieved on average. However, the area under the ROC curve ("AUC ROC") ranged from 0.70 to 0.74 depending on the screening protocol used, and the separation from the null model was imperfect, at only 0.12–0.15 units of AUC ROC. We assume that effective virtual screening across the whole range of the enrichment curve, and not just at the early enrichment stages, requires more accurate solvation modeling and accounting for protein backbone flexibility.
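
The two virtual screening metrics quoted above can be illustrated with a minimal Python sketch. It is not Lead Finder code and does not reproduce the evaluation protocol of this study. The function names `roc_auc` and `enrichment_factor`, the convention that a higher score means a more likely active, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], is_active: Sequence[bool]) -> float:
    """AUC ROC as the probability that a random active outscores a random decoy.

    Assumption: higher score = predicted more likely to be active.
    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores: Sequence[float],
                      is_active: Sequence[bool],
                      fraction: float) -> float:
    """Enrichment factor in the top `fraction` of the ranked list.

    EF = (fraction of actives among the top-ranked compounds)
         / (fraction of actives in the whole library).
    """
    n = len(scores)
    total_actives = sum(1 for a in is_active if a)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, round(n * fraction))
    ranked = sorted(zip(scores, is_active), key=lambda p: p[0], reverse=True)
    hits_top = sum(1 for _, a in ranked[:n_top] if a)
    return (hits_top / n_top) / (total_actives / n)


# Toy library (hypothetical values): 2 actives among 10 compounds.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [True, False, True, False, False, False, False, False, False, False]

auc = roc_auc(toy_scores, toy_labels)
ef10 = enrichment_factor(toy_scores, toy_labels, 0.10)
```

A screen can show a high enrichment factor in the top fraction of the ranked list while its AUC ROC stays moderate, because the enrichment factor only looks at the best-scored compounds whereas AUC ROC averages over the entire ranking. This is the contrast the abstract draws between early enrichment and performance over the whole enrichment curve.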
