A systematic analysis of genomics-based modeling approaches for prediction of drug response to cytotoxic chemotherapies

Background: The availability and generation of large amounts of genomic data have led to a new paradigm in cancer treatment that emphasizes a precision approach at the molecular and genomic level. Statistical modeling that leverages broad-scale in vitro, in vivo, and clinical data for precision drug treatment has become an active area of research. Because the discipline sits at the crossroads of medicine, computer science, and mathematics and is developing rapidly, the techniques in use range from well-established methods to those on the cutting edge of artificial intelligence. Given the diversity and complexity of these techniques, a systematic understanding of fundamental modeling principles is essential for putting influential factors in context, interpreting results, and developing new approaches.

Methods: Using data available from the Genomics of Drug Sensitivity in Cancer (GDSC) and the NCI60, we explore principal components regression, linear and non-linear support vector regression, and artificial neural networks in combination with different implementations of correlation-based feature selection (CBF) for predicting drug response to several cytotoxic chemotherapeutic agents.

Results: Our results indicate that the regression method and the features used have only marginal effects on the Spearman correlation between predicted and measured values and on prediction error. Detailed analysis of these results reveals that the bulk relationship between tissue of origin and drug response is a major driving factor in model performance.

Conclusion: These results illustrate one of the challenges in building pan-cancer predictive models of drug response: bulk genotypic traits, for which the signal-to-noise ratio is high, dominate the behavior these models capture. This suggests that improved feature selection techniques that can discriminate individual cell response from histotype response will yield more successful pan-cancer models.
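The kind of pipeline described in the Methods, and the tissue-of-origin effect reported in the Results, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data in place of GDSC or NCI60, uses a simple top-k filter on absolute Pearson correlation as a stand-in for correlation-based feature selection, and pairs it with principal components regression only. All function names, sizes, and parameters are hypothetical.

```python
import numpy as np

def spearman(a, b):
    # Spearman correlation as the Pearson correlation of ranks (assumes no ties).
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

def select_by_correlation(X, y, k):
    # Hypothetical filter: keep the k genes with the largest |Pearson r| to response.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    r = (Xc * yc[:, None]).sum(axis=0) / denom
    return np.argsort(-np.abs(r))[:k]

def fit_pcr(X, y, n_components):
    # Principal components regression: project centered X onto the leading
    # principal axes, then fit ordinary least squares on the scores.
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:n_components].T
    A = np.column_stack([np.ones(len(y)), (X - mu) @ V])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return mu, V, coef

def predict_pcr(model, X):
    mu, V, coef = model
    return coef[0] + ((X - mu) @ V) @ coef[1:]

# Synthetic "pan-cancer" panel (an assumption, not real data): expression and
# response both depend on tissue of origin; response has no per-line signal.
rng = np.random.default_rng(0)
n_lines, n_genes, n_tissues = 150, 200, 3
tissue = rng.integers(0, n_tissues, size=n_lines)
tissue_profile = rng.normal(0.0, 1.0, size=(n_tissues, n_genes))
X = tissue_profile[tissue] + rng.normal(0.0, 1.0, size=(n_lines, n_genes))
y = np.array([-2.0, 0.0, 2.0])[tissue] + rng.normal(0.0, 0.5, size=n_lines)

# Hold out a third of the cell lines; select features on the training set only.
idx = rng.permutation(n_lines)
train, test = idx[:100], idx[100:]
selected = select_by_correlation(X[train], y[train], k=20)
model = fit_pcr(X[train][:, selected], y[train], n_components=5)
pred = predict_pcr(model, X[test][:, selected])

# Compare pooled performance with performance inside each tissue.
overall = spearman(pred, y[test])
within = [spearman(pred[tissue[test] == t], y[test][tissue[test] == t])
          for t in range(n_tissues)]
print("pan-cancer Spearman:", round(overall, 3))
print("within-tissue Spearman:", [round(w, 3) for w in within])
```

Because the synthetic response carries only a tissue-level signal, the pooled Spearman correlation mostly reflects the model recovering tissue of origin, while the within-tissue correlations have no real signal to recover. Reporting both, as above, is one simple way to check whether a pan-cancer model predicts individual cell line response or mainly histotype.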
