Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel

Motivation: Recent large-scale omics initiatives have catalogued the somatic alterations of cancer cell line panels along with their pharmacological response to hundreds of compounds. In this study, we explored these data to advance computational approaches that enable more effective and targeted use of current and future anticancer therapeutics.

Results: We modelled the 50% growth inhibition bioassay endpoint (GI50) of 17 142 compounds screened against 59 cancer cell lines from the NCI60 panel (941 831 data points, matrix 93.08% complete) by integrating the chemical and biological (cell line) information. Protein, gene transcript and miRNA abundance provide the highest predictive signal when modelling the GI50 endpoint, significantly outperforming DNA copy-number variation and exome sequencing data (Tukey's Honestly Significant Difference, P < 0.05). Within the limits of the data, our approach can both interpolate and extrapolate compound bioactivities to new cell lines and tissues and, to a lesser extent, to dissimilar compounds. Moreover, our approach outperforms previous models generated on the GDSC dataset. Finally, in the cases investigated in more detail, the predicted drug-pathway associations and growth inhibition patterns are mostly consistent with the experimental data, which suggests that genomic markers of drug sensitivity could be identified for novel compounds on novel cell lines.

Contact: terez@pasteur.fr; ab454@ac.cam.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
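The integration of chemical and cell line information described in the Results can be illustrated with a minimal sketch. It assumes that each measured compound-cell line pair becomes one training row, formed by concatenating a compound descriptor vector with a cell line descriptor vector. The abstract does not specify the descriptors, the learning algorithm or the validation scheme, so every name and value below (`compound_desc`, `cell_desc`, `responses`, the toy numbers) is hypothetical. The hold-out helper shows one possible way to test extrapolation to a cell line absent from training.

```python
from typing import Dict, List, Tuple

# Hypothetical toy descriptors; real inputs might be chemical fingerprints
# and omics profiles (e.g. transcript or protein abundance).
compound_desc: Dict[str, List[float]] = {
    "cpd_A": [1.0, 0.0, 1.0],
    "cpd_B": [0.0, 1.0, 1.0],
}
cell_desc: Dict[str, List[float]] = {
    "cell_1": [0.2, 0.9],
    "cell_2": [0.7, 0.1],
    "cell_3": [0.5, 0.5],
}

# Sparse response matrix: only measured (compound, cell line) pairs appear.
# Values are made-up activity readouts, not data from the study.
responses: Dict[Tuple[str, str], float] = {
    ("cpd_A", "cell_1"): 5.1,
    ("cpd_A", "cell_2"): 6.3,
    ("cpd_B", "cell_1"): 4.8,
    ("cpd_B", "cell_2"): 5.5,
    ("cpd_B", "cell_3"): 7.0,
}


def completeness(measured: Dict[Tuple[str, str], float],
                 n_compounds: int, n_cells: int) -> float:
    """Percentage of compound-cell line pairs that have a measurement."""
    return 100.0 * len(measured) / (n_compounds * n_cells)


def build_rows(measured, compounds, cells):
    """One row per measured pair: compound features followed by cell features."""
    X, y, groups = [], [], []
    for (cpd, cell), value in sorted(measured.items()):
        X.append(compounds[cpd] + cells[cell])  # list concatenation
        y.append(value)
        groups.append(cell)  # remember the cell line for grouped splits
    return X, y, groups


def leave_cell_line_out(groups: List[str], held_out: str):
    """Row indices for training and testing when one cell line is withheld."""
    train = [i for i, g in enumerate(groups) if g != held_out]
    test = [i for i, g in enumerate(groups) if g == held_out]
    return train, test


X, y, groups = build_rows(responses, compound_desc, cell_desc)
train_idx, test_idx = leave_cell_line_out(groups, "cell_3")
```

Any regression model could then be fitted on the rows in `train_idx` and scored on `test_idx`. Withholding all rows of one compound instead would probe extrapolation to new compounds in the same way.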
