Predicting kinase inhibitors using bioactivity matrix derived informer sets

Predicting which compounds are active against a desired biological target is a common step in drug discovery. Virtual screening methods seek an active-enriched fraction of a library for experimental testing. When data are too scarce to train supervised learning models for compound prioritization, an initial screen must provide the necessary data. Commonly, such an initial library is selected on the basis of chemical diversity by some pseudo-random process (for example, the first few plates of a larger library) or by selecting an entire smaller library. These approaches may not produce a sufficient number or diversity of actives. An alternative is to select an informer set of screening compounds on the basis of chemogenomic information, that is, previous testing of compounds against a large number of targets. We compare different ways of using such previously measured bioactivity data to choose a small informer set of compounds. We develop this Informer-Based-Ranking (IBR) approach using the Published Kinase Inhibitor Sets (PKIS) as the chemogenomic data from which the informer sets are selected. The informer compounds are tested on a target that is not part of the chemogenomic data, and the activity of the remaining compounds is then predicted from the experimental informer data together with the chemogenomic data. Through new chemical screening experiments, we demonstrate the utility of IBR strategies in a prospective test on three kinase targets not included in the PKIS.
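The select, test, and rank workflow described above can be illustrated with a minimal sketch. It is not one of the IBR strategies evaluated in this work. It assumes a dense targets-by-compounds activity matrix, picks informers by a simple variance heuristic, and scores the remaining compounds by a similarity-weighted average over known targets. The function names `select_informers` and `rank_non_informers`, and the toy data, are hypothetical.

```python
import numpy as np


def select_informers(activity, n_informers):
    """Pick the compounds whose activity varies most across known targets.

    `activity` is a (targets x compounds) matrix. The variance criterion is
    an illustrative assumption, not a method taken from the source.
    """
    variances = activity.var(axis=0)
    # argsort is ascending: keep the last n_informers, highest variance first.
    return np.argsort(variances, kind="stable")[-n_informers:][::-1]


def rank_non_informers(activity, informers, informer_readout):
    """Rank non-informer compounds for a new target.

    `informer_readout` holds the new target's measured activity on the
    informer compounds, in the same order as `informers`.
    """
    n_targets, n_compounds = activity.shape
    profiles = activity[:, informers]

    # Pearson correlation between the new target and each known target,
    # computed on the informer compounds only.
    centered = profiles - profiles.mean(axis=1, keepdims=True)
    readout_c = informer_readout - informer_readout.mean()
    denom = np.linalg.norm(centered, axis=1) * np.linalg.norm(readout_c)
    sim = np.divide(centered @ readout_c, denom,
                    out=np.zeros(n_targets), where=denom > 0)

    # Keep only positively correlated targets; fall back to uniform weights.
    weights = np.clip(sim, 0.0, None)
    if weights.sum() == 0:
        weights = np.ones(n_targets)
    weights = weights / weights.sum()

    # Predicted activity: weighted average of the known targets' profiles.
    scores = weights @ activity
    rest = np.setdiff1d(np.arange(n_compounds), informers)
    order = rest[np.argsort(-scores[rest], kind="stable")]
    return order, scores


# Toy demonstration with synthetic data (not PKIS measurements).
rng = np.random.default_rng(0)
known = rng.random((6, 20))                      # 6 known targets, 20 compounds
hidden = known[2] + 0.01 * rng.standard_normal(20)  # unseen target's true activity
informers = select_informers(known, 5)
order, scores = rank_non_informers(known, informers, hidden[informers])
print("informers:", informers.tolist())
print("top-ranked non-informers:", order[:5].tolist())
```

In this sketch only the informer columns of the new target are ever "measured"; everything else about the new target is inferred from the existing matrix. That mirrors the prospective setting described above, although real chemogenomic matrices may be sparse and would need a different similarity computation.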
