Deep learning of pharmacogenomics resources: moving towards precision oncology

The recent accumulation of cancer genomic data provides an opportunity to understand how a tumor's genomic characteristics affect its response to drugs. This field, called pharmacogenomics, is a key area in the development of precision oncology. Deep learning (DL) has emerged as a powerful technique to characterize and learn from rapidly accumulating pharmacogenomics data. We introduce the fundamentals and typical model architectures of DL. We then review the use of DL in four areas: classification of cancers and cancer subtypes (diagnosis and treatment stratification of patients); prediction of drug response and drug synergy for individual tumors (treatment prioritization for a patient); drug repositioning and discovery; and the study of the mechanism/mode of action of treatments. For each topic, we summarize current genomics and pharmacogenomics data resources, such as pan-cancer genomics data for cancer cell lines (CCLs) and tumors, and systematic pharmacologic screens of CCLs. Drawing on the published literature and our in-house analyses, we demonstrate the unprecedented capability of DL, enabled by the rapid accumulation of data resources, to decipher complex drug response patterns, and thus its potential to improve cancer medicine. Overall, this review provides an in-depth summary of state-of-the-art DL methods and up-to-date pharmacogenomics resources, along with the future opportunities and challenges in realizing the goal of precision oncology.
