A Hybrid Structure-Based Machine Learning Approach for Predicting Kinase Inhibition by Small Molecules

Kinases have been the focus of drug discovery programs for three decades, leading to over 70 therapeutic kinase inhibitors and biophysical affinity measurements for over 130,000 kinase-compound pairs. Nonetheless, the precise target spectrum for many kinases remains only partly understood. In this study, we describe a computational approach that makes qualitative and quantitative kinome-wide binding measurements usable for structure-based machine learning. Our study has three components: (i) a Kinase Inhibitor Complex (KinCo) data set comprising in silico predicted kinase structures paired with experimental binding constants, (ii) a machine learning loss function that integrates qualitative and quantitative data for model training, and (iii) a structure-based machine learning model trained on KinCo. We show that our approach outperforms methods trained on crystal structures alone in predicting both binary and quantitative kinase-compound interaction affinities. Relative to structure-free methods, our approach also captures known kinase biochemistry and generalizes more successfully to distant kinase sequences and compound scaffolds.
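To make component (ii) concrete, the following is a minimal sketch of one way a single loss could combine quantitative and qualitative binding labels. It is an illustration under stated assumptions, not the loss function used in this study: the function name `hybrid_loss`, the pKd-style scale, the activity threshold of 6.0, and the squared one-sided penalty are all hypothetical choices made for the example.

```python
from typing import Optional, Sequence, Tuple

# Each example is (prediction, measured_affinity, is_binder):
#   quantitative entry: measured_affinity is a float, is_binder is None
#   qualitative entry:  measured_affinity is None, is_binder is True/False
Example = Tuple[float, Optional[float], Optional[bool]]


def hybrid_loss(examples: Sequence[Example], threshold: float = 6.0) -> float:
    """Mean loss over a mix of quantitative and qualitative labels.

    Assumptions (illustrative only): affinities are on a pKd-like scale
    where larger means tighter binding, and `threshold` separates
    binders from non-binders.
    """
    total = 0.0
    for pred, measured, is_binder in examples:
        if measured is not None:
            # Quantitative label: ordinary squared error.
            total += (pred - measured) ** 2
        else:
            # Qualitative label: penalize only predictions on the wrong
            # side of the threshold (a one-sided squared penalty).
            if is_binder:
                violation = threshold - pred   # binder should score >= threshold
            else:
                violation = pred - threshold   # non-binder should score <= threshold
            total += max(violation, 0.0) ** 2
    return total / len(examples)


batch = [
    (7.0, 6.0, None),   # quantitative: error of 1.0
    (5.0, None, True),  # binder predicted below threshold: penalized
    (8.0, None, True),  # binder predicted above threshold: no penalty
    (4.0, None, False), # non-binder predicted below threshold: no penalty
]
loss = hybrid_loss(batch)
```

The one-sided term lets binary measurements constrain the model without asserting an exact affinity value, which is the general idea behind training on both kinds of data at once.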
