Normalizing Molecular Docking Rankings using Virtually Generated Decoys

Drug discovery research often relies on virtual screening by molecular docking to identify active hits in compound libraries. A recognized weakness of many state-of-the-art docking methods is the accuracy of the scoring functions used to differentiate active from nonactive ligands. Many contemporary scoring functions are influenced by the physical properties of the docked molecule, and this bias can cause molecules with certain physical properties to score better than they should relative to others. Since variation in physical properties is inevitable in large screening libraries, it is desirable to account for this bias. In this paper, we present a method of normalizing docking scores using virtually generated decoy sets with matched physical properties. First, our method generates a set of property-matched decoys for every molecule in the screening library. Each library molecule and its decoy set are then docked using a state-of-the-art method, producing a set of raw docking scores. Next, the raw docking score of each library molecule is normalized against the scores of its decoys. The normalized score represents the probability that the raw docking score was drawn from the background distribution of nonactive property-matched decoys. Assuming that the score distribution of active molecules differs from that of nonactive molecules, we expect the score of an active compound to have a low probability of having been drawn from the nonactive score distribution. Beyond normalizing docking scores, we suggest that decoy sets may be a useful tool to evaluate, improve, or develop scoring functions. We show that by analyzing the docking scores of library molecules with respect to the docking scores of their virtually generated property-matched decoys, one can gain insight into the advantages, limitations, and reliability of scoring functions.
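
The normalization step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the background probability is estimated, so the sketch assumes a simple empirical estimate (the fraction of decoys that score at least as well as the library molecule, with an add-one correction) and assumes that lower docking scores are better. The names `normalized_score` and `rank_library` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def normalized_score(raw_score: float,
                     decoy_scores: Sequence[float],
                     lower_is_better: bool = True) -> float:
    """Empirical probability that a property-matched decoy scores at
    least as well as the library molecule (assumed estimator)."""
    if len(decoy_scores) == 0:
        raise ValueError("need at least one decoy score")
    if lower_is_better:
        as_good = sum(1 for s in decoy_scores if s <= raw_score)
    else:
        as_good = sum(1 for s in decoy_scores if s >= raw_score)
    # Add-one correction: the estimate never reaches exactly zero,
    # since a finite decoy set cannot rule out the background entirely.
    return (as_good + 1) / (len(decoy_scores) + 1)


def rank_library(
    library: Dict[str, Tuple[float, Sequence[float]]]
) -> List[Tuple[str, float]]:
    """Rank molecules by normalized score, most significant first.

    `library` maps a molecule name to (raw score, scores of its decoys).
    """
    normalized = {
        name: normalized_score(raw, decoys)
        for name, (raw, decoys) in library.items()
    }
    return sorted(normalized.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up scores for illustration only.
    library = {
        # Good raw score, but its decoys score just as well.
        "mol_A": (-9.0, [-9.5, -10.0, -9.2, -8.0]),
        # Weaker raw score, but clearly better than all of its decoys.
        "mol_B": (-7.0, [-5.0, -4.0, -6.0, -3.0]),
    }
    for name, p in rank_library(library):
        print(name, p)
```

In this toy input, `mol_A` has the better raw score but is matched by most of its decoys, whereas `mol_B` outscores every one of its decoys, so `mol_B` ranks first after normalization. This is the kind of reordering the method is meant to produce when a raw score is inflated by physical properties rather than by specific binding.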
