Predicting protein targets for drug-like compounds using transcriptomics

An expanded chemical space is essential for improved identification of small molecules for emerging therapeutic targets. However, the identification of targets for novel compounds is biased towards the synthesis of known scaffolds that bind familiar protein families, limiting the exploration of chemical space. To change this paradigm, we validated a new pipeline that identifies small molecule-protein interactions and works even for compounds lacking similarity to known drugs. Based on differential mRNA profiles from multiple cell types that were either exposed to drugs or subjected to gene knockdowns (KD), we showed that drugs induce gene regulatory networks that correlate with those produced after silencing protein-coding genes. Next, we applied supervised machine learning to exploit drug-KD signature correlations and enriched our predictions using an orthogonal structure-based screen. As a proof-of-principle for this regimen, top-10 and top-100 target prediction accuracies of 26% and 41%, respectively, were achieved on a validation set of 152 FDA-approved drugs and 3104 potential targets. We then predicted targets for 1680 compounds and validated chemical interactors with four targets that have proven difficult to chemically modulate, including non-covalent inhibitors of HRAS and KRAS. Importantly, drug-target interactions manifest as gene expression correlations between drug treatment and both KD of the target gene and KD of genes that act upstream or downstream of the target, even for relatively weak binders. These correlations provide new insights into the cellular response to disrupting protein interactions and highlight the complex genetic phenotypes of drug treatment. With further refinement, our pipeline may accelerate the identification and development of novel chemical classes by screening compound-target interactions.
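The core idea of matching a drug signature against KD signatures, and the top-k accuracy metric quoted above, can be illustrated with a minimal sketch. This is not the pipeline described in the abstract: it uses plain Pearson correlation rather than supervised machine learning, omits the structure-based screen, and runs on synthetic data. The function names `rank_targets` and `top_k_accuracy`, the array shapes, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def rank_targets(drug_sig, kd_sigs):
    """Rank candidate targets by Pearson correlation between one drug
    signature and each knockdown (KD) signature, best match first."""
    d = drug_sig - drug_sig.mean()
    k = kd_sigs - kd_sigs.mean(axis=1, keepdims=True)
    corr = (k @ d) / (np.linalg.norm(k, axis=1) * np.linalg.norm(d))
    return np.argsort(-corr), corr


def top_k_accuracy(rankings, true_targets, k):
    """Fraction of drugs whose known target is among the top k ranked."""
    hits = [t in r[:k] for r, t in zip(rankings, true_targets)]
    return float(np.mean(hits))


# Synthetic toy data (assumption): 50 candidate targets, 200 genes.
rng = np.random.default_rng(0)
n_targets, n_genes = 50, 200
kd_sigs = rng.normal(size=(n_targets, n_genes))

# Each toy "drug" mimics the KD signature of its true target plus noise.
true_targets = [3, 17, 42]
drug_sigs = [kd_sigs[t] + 0.5 * rng.normal(size=n_genes) for t in true_targets]

rankings = [rank_targets(s, kd_sigs)[0] for s in drug_sigs]
print("top-10 accuracy:", top_k_accuracy(rankings, true_targets, k=10))
```

In this toy setting the true target is easy to recover because each drug signature is built directly from one KD signature. Real signatures are far noisier and also correlate with KDs of upstream and downstream genes, which is consistent with the much lower accuracies reported above and with the use of a learned model on top of the raw correlations.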
