CSM-lig: a web server for assessing and comparing protein–small molecule affinities

Determining the affinity of a ligand for a given protein is a crucial component of drug development and of understanding the ligand's biological effects. Predicting binding affinities is a challenging task, and despite being regarded as poorly predictive, scoring functions play an important role in the analysis of molecular docking results. Here, we present CSM-lig (http://structure.bioc.cam.ac.uk/csm_lig), a web server tailored to predict the binding affinity of a protein–small molecule complex, capturing the complementarity between protein and small molecule in terms of shape and chemistry via graph-based structural signatures. CSM-lig was trained and evaluated on different releases of the PDBbind database, achieving a correlation of up to 0.86 on 10-fold cross-validation and 0.80 in blind tests, performing as well as or better than other widely used methods. The web server allows users to rapidly and automatically predict binding affinities for collections of structures and assess the interactions made. We believe CSM-lig would be an invaluable tool for helping assess docking poses and the effects of multiple mutations, including insertions, deletions and alternative splicing events, on protein–small molecule affinity, unraveling important aspects that drive protein–compound recognition.
