Transfer Learning Approaches to Improve Drug Sensitivity Prediction in Multiple Myeloma Patients

Traditional machine learning approaches to drug sensitivity prediction assume that the training and test data lie in the same feature space and follow the same underlying distribution. In real-world applications, however, this assumption often does not hold. For example, we may have limited training data for drug sensitivity prediction in multiple myeloma patients (the target task) but sufficient auxiliary data for drug sensitivity prediction in patients with another cancer type (the related task), where the auxiliary data are in a different feature space or follow a different distribution. In such cases, transfer learning, if applied correctly, can improve the performance of prediction algorithms on the test data of the target task by leveraging the auxiliary data from the related task. In this paper, we present two transfer learning approaches that combine the auxiliary data from the related task with the training data of the target task. We evaluate our approaches using three auxiliary data sets and compare them against baseline approaches using the area under the receiver operating characteristic curve (AUC) on the test data of the target task. Experimental results show that our approaches outperform the baseline approaches when auxiliary data are incorporated.
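The evaluation setup described above can be illustrated with a minimal sketch. It is not either of the two approaches presented in the paper, whose details are not given here. It only shows the general idea of pooling auxiliary examples with target training examples and scoring the result by the area under the receiver operating characteristic curve (AUC) on target test data. The sketch assumes the auxiliary and target data already share one feature space. The toy centroid classifier, the function names (`roc_auc`, `fit_centroid_scorer`), and all numbers are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def centroid(rows: List[List[float]]) -> List[float]:
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_centroid_scorer(
    X: List[List[float]], y: List[int]
) -> Callable[[List[float]], float]:
    """Toy classifier (an assumption, not the paper's model).

    Scores a sample by its projection onto the difference between the
    sensitive-class (1) and resistant-class (0) centroids.
    """
    mu1 = centroid([x for x, t in zip(X, y) if t == 1])
    mu0 = centroid([x for x, t in zip(X, y) if t == 0])
    w = [a - b for a, b in zip(mu1, mu0)]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: two features per patient, label 1 = sensitive.
target_X, target_y = [[0.9, 0.2], [0.1, 0.8]], [1, 0]            # scarce target data
aux_X = [[1.0, 0.1], [0.8, 0.3], [0.2, 0.9], [0.0, 0.7]]          # related cancer type
aux_y = [1, 1, 0, 0]
test_X = [[0.7, 0.4], [0.6, 0.1], [0.3, 0.6], [0.2, 0.9]]         # target test data
test_y = [1, 1, 0, 0]

# Baseline: train on the target training data only.
baseline = fit_centroid_scorer(target_X, target_y)
baseline_auc = roc_auc(test_y, [baseline(x) for x in test_X])

# Naive transfer: pool auxiliary and target training data before fitting.
pooled = fit_centroid_scorer(target_X + aux_X, target_y + aux_y)
pooled_auc = roc_auc(test_y, [pooled(x) for x in test_X])

print(f"baseline AUC = {baseline_auc:.3f}, pooled AUC = {pooled_auc:.3f}")
```

Plain pooling like this can also hurt performance when the auxiliary distribution differs strongly from the target distribution, which is why the comparison against a target-only baseline on held-out target data matters.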
