Predicting Binding Affinity of CSAR Ligands Using Both Structure-Based and Ligand-Based Approaches

We report on the prediction accuracy of ligand-based (2D QSAR) and structure-based (MedusaDock) methods, used both independently and in consensus, for ranking the congeneric series of ligands binding to three protein targets (UK, ERK2, and CHK1) from the CSAR 2011 benchmark exercise. An ensemble of predictive QSAR models was developed using known binders of these three targets extracted from the publicly available ChEMBL database. Selected models were used to predict the binding affinity of CSAR compounds toward the corresponding targets and to rank them accordingly; the ranking accuracy, measured by the Spearman correlation coefficient, was as high as 0.78 for UK, 0.60 for ERK2, and 0.56 for CHK1, placing our predictions in the top 10% among all participants. In parallel, MedusaDock, which is designed to predict reliable docking poses, was used to rank the CSAR ligands according to their docking scores; the resulting Spearman correlation coefficients for UK, ERK2, and CHK1 were 0.76, 0.31, and 0.26, respectively. In addition, we explored several consensus approaches that combine MedusaDock- and QSAR-predicted ranks; the best approach yielded Spearman correlation coefficients for UK, ERK2, and CHK1 of 0.82, 0.50, and 0.45, respectively. This study shows that (i) externally validated 2D QSAR models were capable of ranking CSAR ligands at least as accurately as the more computationally intensive structure-based approaches used both by us and by other groups and (ii) ligand-based QSAR models can complement structure-based approaches by improving prediction performance when used in consensus.
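To make the evaluation concrete, the following is a minimal Python sketch of rank-based consensus scoring and its assessment by Spearman correlation. It is an illustration under stated assumptions, not the procedure used in the study: the abstract does not specify which consensus schemes were tried, so a plain average of the two ranks is assumed here. The function names (`rank`, `spearman`, `consensus_rank`) and all numeric values are hypothetical, and the docking score is assumed to be energy-like, so that lower means better.

```python
from statistics import mean


def rank(values, descending=False):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def consensus_rank(qsar_affinity, docking_score):
    """Average the two ranks (assumed scheme; rank 1 = predicted strongest binder).

    Assumes higher QSAR-predicted affinity is better and a lower
    (more negative) docking score is better.
    """
    r_qsar = rank(qsar_affinity, descending=True)
    r_dock = rank(docking_score)  # ascending: most negative score ranks first
    return [(a + b) / 2 for a, b in zip(r_qsar, r_dock)]


if __name__ == "__main__":
    # Hypothetical values for five ligands, for illustration only.
    experimental = [8.1, 6.4, 7.2, 5.9, 7.8]       # e.g. measured pIC50
    qsar_affinity = [7.6, 6.9, 7.0, 6.1, 7.9]      # QSAR-predicted affinity
    docking_score = [-42.0, -35.5, -30.1, -33.0, -40.2]  # energy-like score

    exp_rank = rank(experimental, descending=True)
    for name, pred in [
        ("QSAR", rank(qsar_affinity, descending=True)),
        ("docking", rank(docking_score)),
        ("consensus", consensus_rank(qsar_affinity, docking_score)),
    ]:
        print(f"{name:>9}: Spearman = {spearman(pred, exp_rank):.2f}")
```

Working in rank space avoids having to put predicted affinities and docking scores on a common scale, which is one reason rank averaging is a common choice for combining heterogeneous predictors.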
