Polypharmacology Within the Full Kinome: a Machine Learning Approach

Protein kinases generate nearly a thousand different protein products and regulate the majority of cellular pathways and signal transduction. It is therefore not surprising that the deregulation of kinases has been implicated in many disease states. In fact, kinase inhibitors are the largest class of new cancer therapies. Understanding polypharmacology within the full kinome, that is, how drugs interact with many different kinases, would allow for the development of safer and more efficacious cancer therapies. Characterizing all of these interactions experimentally is not feasible, which makes highly accurate computational predictions especially valuable. This work aims to build a machine learning model suitable for investigating the full kinome. We evaluate many feature sets for our model, and every one of them outperforms molecular docking. We demonstrate that our model achieves a nearly 60% increase in success rate at identifying binding compounds compared with molecular docking scores.
