Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning

Significance Cell lines have been extensively used to study anticancer agents, thereby establishing vast molecular and drug response datasets. Unfortunately, the translation of cell line–derived biomarkers often fails. To bridge this gap between model systems and clinical practice, we developed a mathematical framework to capture gene expression patterns shared between model systems and human tumors in a consensus space. In this space, we trained drug response predictors on a panel of 1,000 cell lines and successfully predicted drug response on approximately 1,300 human tumors. Finally, we derived an approach to interpret the predictors, and we propose potential mechanisms mediating the cytotoxic effects of two drugs. Experimental validation is required to confirm these results.

Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.
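
The general idea of a consensus space can be illustrated with a small toy sketch. The code below is not the TRANSACT implementation: it is a deliberately linear stand-in (principal-vector alignment of two PCA subspaces followed by ridge regression), whereas the framework described above is nonlinear. All function names (`principal_axes`, `consensus_axes`, `fit_ridge`, `simulate`), the simulated data, and the parameter values are assumptions made purely for illustration.

```python
import numpy as np

def principal_axes(X, n_components):
    """Top principal axes (as rows) of X after per-gene centering."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components]

def consensus_axes(source, target, n_components):
    """Pair the most similar source/target directions and average each pair."""
    Ps = principal_axes(source, n_components)
    Pt = principal_axes(target, n_components)
    # SVD of the cross-product gives principal vectors; the singular values
    # are the cosines of the angles between the paired directions.
    U, cosines, Vt = np.linalg.svd(Ps @ Pt.T)
    mid = U.T @ Ps + Vt @ Pt
    mid /= np.linalg.norm(mid, axis=1, keepdims=True)
    return mid, cosines

def fit_ridge(Z, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Z.shape[1]), Zc.T @ (y - y_mean))
    return w, y_mean - z_mean @ w

# Simulated data (hypothetical): both domains share the same gene loadings,
# but the "tumor" domain carries a constant expression offset.
rng = np.random.default_rng(0)
n_genes, n_factors = 50, 3
loadings = rng.normal(size=(n_factors, n_genes))

def simulate(n_samples, offset):
    factors = rng.normal(size=(n_samples, n_factors))
    expr = factors @ loadings + offset + 0.1 * rng.normal(size=(n_samples, n_genes))
    response = factors[:, 0] + 0.1 * rng.normal(size=n_samples)
    return expr, response

X_source, y_source = simulate(200, offset=0.0)  # stand-in for cell lines
X_target, y_target = simulate(100, offset=2.0)  # stand-in for tumors

# Build the shared axes, then project each domain after centering it separately.
axes, cosines = consensus_axes(X_source, X_target, n_factors)
Z_source = (X_source - X_source.mean(axis=0)) @ axes.T
Z_target = (X_target - X_target.mean(axis=0)) @ axes.T

# Train on labeled source samples only, then predict on the target domain.
w, b = fit_ridge(Z_source, y_source)
pred = Z_target @ w + b
print("correlation on target domain:", np.corrcoef(pred, y_target)[0, 1])
```

In this sketch the target labels are used only to score the predictions, never for training, which mirrors the transfer setting in which response labels exist for preclinical models but not for the tumors being predicted.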
