An Overview of Scoring Functions Used for Protein–Ligand Interactions in Molecular Docking

Molecular docking has become a key tool in drug discovery and molecular modeling. Its reliability depends on the accuracy of the scoring function, which guides the search and selects among the thousands of candidate ligand poses that are generated. A scoring function can be used to determine the binding mode and binding site of a ligand, to predict binding affinity, and to identify potential drug leads for a given protein target. Despite intensive research over the years, accurate and rapid prediction of protein–ligand interactions remains a challenge in molecular docking. This study therefore reviews four basic types of scoring functions (physics-based, empirical, knowledge-based, and machine learning-based), following an up-to-date classification scheme. We discuss the foundations, suitable application areas, and shortcomings of each type, as well as open challenges and potential directions for future study.
