DockingApp RF: A State-of-the-Art Novel Scoring Function for Molecular Docking in a User-Friendly Interface to AutoDock Vina

Motivation: Bringing a new drug to market is expensive and time-consuming. To cut costs and time, computer-aided drug design (CADD) approaches have been increasingly included in the drug discovery pipeline. However, although traditional docking tools sample conformational space well, they are still unable to produce accurate binding affinity predictions. This work presents a novel scoring function for molecular docking, seamlessly integrated into DockingApp, a user-friendly graphical interface for AutoDock Vina. The proposed function is based on a random forest model and a selection of specific features chosen to overcome the limits of Vina’s original scoring mechanism. A new version of DockingApp, named DockingApp RF, has been developed to host the proposed scoring function and to automate the rescoring of AutoDock Vina’s output, making the procedure accessible even to nonexpert users.

Results: By coupling intermolecular interaction and solvent-accessible surface area features with Vina’s energy terms, DockingApp RF’s new scoring function improves the binding affinity prediction of AutoDock Vina. Furthermore, comparison tests carried out on the CASF-2013 and CASF-2016 datasets demonstrate that DockingApp RF’s performance is comparable to that of other state-of-the-art machine-learning- and deep-learning-based scoring functions. The new scoring function thus represents a significant advancement in the reliability and effectiveness of docking compared to AutoDock Vina’s scoring function. At the same time, the characteristics that made DockingApp appealing to a wide range of users are retained in this new version and have been complemented with additional features.
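The general idea of rescoring docking output with a random forest can be illustrated with a minimal sketch. It is not DockingApp RF's implementation: the three features (a Vina energy term, an intermolecular contact count, and a solvent-accessible surface area value), the synthetic affinity values, and all function names and hyperparameters are assumptions made purely for illustration. The forest is written with the Python standard library only, as bootstrap-sampled regression trees that consider a random feature subset at each split.

```python
import random
from statistics import mean


def build_tree(rows, targets, rng, depth=0, max_depth=3, min_leaf=2):
    """Grow one regression tree: a float leaf or (feature, threshold, left, right)."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(targets)
    n_features = len(rows[0])
    # Random feature subset per split: the "random" part of a random forest.
    candidates = rng.sample(range(n_features), max(1, n_features // 3))
    best = None
    for f in candidates:
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            # Split quality: summed squared error around each side's mean.
            sse = 0.0
            for side in (left, right):
                m = mean(targets[i] for i in side)
                sse += sum((targets[i] - m) ** 2 for i in side)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left, right)
    if best is None:
        return mean(targets)
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   rng, depth + 1, max_depth, min_leaf),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   rng, depth + 1, max_depth, min_leaf),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Forest prediction: the average of the individual tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)


def make_toy_complexes(n=60, seed=1):
    """Synthetic stand-ins for protein-ligand complexes (hypothetical data)."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        vina_term = rng.uniform(-12.0, -4.0)   # assumed Vina energy term
        contacts = rng.uniform(0.0, 50.0)      # assumed contact count
        sasa = rng.uniform(100.0, 600.0)       # assumed SASA value
        rows.append([vina_term, contacts, sasa])
        # Invented relationship, used only so the model has something to learn.
        targets.append(-0.6 * vina_term + 0.03 * contacts + rng.gauss(0.0, 0.3))
    return rows, targets


rows, targets = make_toy_complexes()
forest = fit_forest(rows, targets)
rescored = predict(forest, rows[0])
print(f"rescored affinity for first toy pose: {rescored:.2f}")
```

In practice a library implementation such as scikit-learn's `RandomForestRegressor` would normally replace the hand-written trees; the sketch only shows how several feature groups can be concatenated into one vector per pose and mapped to a single predicted affinity.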
