FlexAID: Revisiting docking on non native-complex structures

Small-molecule protein docking is an essential tool for drug design and for understanding molecular recognition. In the present work we introduce FlexAID, a small-molecule docking algorithm that accounts for target side-chain flexibility and uses a soft scoring function based on surface complementarity, i.e. one that is not highly dependent on specific geometric criteria. The pairwise energy parameters were derived from a large dataset of true positive poses and negative decoys from the PDBbind dataset through an iterative process using Monte Carlo simulations. The prediction of binding poses is tested using the independent Astex dataset, while performance in virtual screening is evaluated using a subset of the DUD dataset. To understand the strengths and limitations of the different programs, we compare FlexAID to AutoDock Vina, FlexX, and rDock in a large number of scenarios, and, where applicable, to reported results for Glide, GOLD and DOCK6. The most relevant of these scenarios is docking on flexible non-native-complex structures, where, as is the case in reality, the target conformation in the bound form is not known a priori. We demonstrate that FlexAID, unlike the other programs, is robust against increasing structural variability. FlexAID achieves sampling success equivalent to that of GOLD and performs better than AutoDock Vina or FlexX in all scenarios against non-native-complex structures. FlexAID performs better than rDock when at least one critical side-chain movement is required upon ligand binding. In virtual screening, FlexAID rescored results are comparable to those of AutoDock Vina and rDock. Its higher accuracy on flexible targets where critical movements are required, its intuitive PyMOL-integrated graphical user interface, and its free source code and pre-compiled executables for Windows, Linux and Mac OS make FlexAID a welcome addition to the arsenal of existing small-molecule protein docking methods.
AUTHOR SUMMARY Protein-ligand interactions are essential for understanding biological processes such as enzymatic reactions and signalling pathways, as well as for developing new medicines. Docking algorithms make it possible to predict the structure of a ligand-protein complex at the atomic level. Several docking algorithms have been developed over the years, with a tendency towards very specific and detailed (i.e., hard) descriptions of molecular interactions. In this work we present a new docking algorithm called FlexAID that uses a very general and superficial (i.e., soft) description of interactions based on atomic surface areas in contact. We demonstrate that, in realistic scenarios, FlexAID can predict the structure of ligand-protein complexes more accurately than existing methods that are accessible, widely used or state-of-the-art. These are scenarios in which the flexible target differs structurally from the final protein structure present in the ligand-protein complex. FlexAID and its PyMOL-integrated graphical user interface are free, easy to use and available for Windows, Linux and Mac OS.
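
To make the idea of a soft, contact-based score concrete, the following is a minimal sketch and not FlexAID's actual implementation. It assumes the score is a sum over atom pairs of a pairwise energy parameter multiplied by the surface area in contact. The atom-type names, the energy values, and the functions `pair_energy` and `contact_score` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical pairwise energy parameters (arbitrary units).
# Negative values are favourable; these numbers are NOT from FlexAID.
PAIR_ENERGY: Dict[Tuple[str, str], float] = {
    ("hydrophobic", "hydrophobic"): -1.0,
    ("donor", "acceptor"): -2.0,
    ("hydrophobic", "donor"): 0.5,
}


def pair_energy(type_a: str, type_b: str) -> float:
    """Look up the energy for an atom-type pair, ignoring order."""
    if (type_a, type_b) in PAIR_ENERGY:
        return PAIR_ENERGY[(type_a, type_b)]
    # Unknown pairs contribute nothing in this sketch.
    return PAIR_ENERGY.get((type_b, type_a), 0.0)


def contact_score(contacts: Iterable[Tuple[str, str, float]]) -> float:
    """Sum energy * contact area over (type_a, type_b, area) triples."""
    return sum(pair_energy(a, b) * area for a, b, area in contacts)


# Each triple: ligand atom type, protein atom type, contact area (A^2).
example_contacts = [
    ("hydrophobic", "hydrophobic", 12.0),
    ("acceptor", "donor", 5.0),
    ("donor", "hydrophobic", 2.0),
]
score = contact_score(example_contacts)
```

Because each term scales with contact area rather than with a distance or angle cutoff, a small shift in atomic positions changes the score only slightly under this assumed form, which is the sense in which such a function is "soft".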
