Dimensionality Reduction based Transfer Learning applied to Pharmacogenomics Databases

In recent years, a number of pharmacogenomics databases have been published, enabling the testing of various predictive modeling techniques for personalized therapy applications. However, consistency between these databases is usually limited, even though they share a significant number of cell lines and drugs. In this article, we consider whether a model learned from one (secondary) database can be used to improve prediction on another (target) database. Using two pharmacogenomics databases, we illustrate that representing both databases with common basis vectors can improve prediction performance compared with naively applying a model trained on one database to the other. We also examine the robustness of PCA-based basis vectors in scenarios with weakly correlated input features.
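The common-basis idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure used in the article: the data are synthetic, the per-database centering step, the least-squares predictor, and all function and variable names (fit_common_basis, fit_linear, predict_linear, simulate) are hypothetical choices made for the example.

```python
import numpy as np


def fit_common_basis(X_secondary, X_target, n_components):
    """PCA basis computed from both databases stacked together.

    Assumption: each database is centered with its own feature means
    before stacking, so a constant offset between databases is removed.
    """
    mu_s = X_secondary.mean(axis=0)
    mu_t = X_target.mean(axis=0)
    stacked = np.vstack([X_secondary - mu_s, X_target - mu_t])
    # Rows of Vt are the principal directions of the stacked data.
    _, _, Vt = np.linalg.svd(stacked, full_matrices=False)
    basis = Vt[:n_components].T  # shape: (n_features, n_components)
    return basis, mu_s, mu_t


def fit_linear(Z, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([Z, np.ones(len(Z))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(Z, coef):
    return np.column_stack([Z, np.ones(len(Z))]) @ coef


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic stand-ins for two databases that share latent structure
# but differ by a per-feature offset (an assumed form of inconsistency).
rng = np.random.default_rng(0)
n_s, n_t, n_feat, n_latent = 80, 60, 30, 3
loadings = rng.normal(size=(n_latent, n_feat))
w_latent = rng.normal(size=n_latent)


def simulate(n, offset):
    latent = rng.normal(size=(n, n_latent))
    X = latent @ loadings + 0.1 * rng.normal(size=(n, n_feat)) + offset
    y = latent @ w_latent + 0.1 * rng.normal(size=n)
    return X, y


X_s, y_s = simulate(n_s, offset=0.0)
X_t, y_t = simulate(n_t, offset=rng.normal(size=n_feat))

# Common-basis transfer: train on the secondary database in the shared
# PCA coordinates, then predict on the target in the same coordinates.
basis, mu_s, mu_t = fit_common_basis(X_s, X_t, n_components=n_latent)
coef = fit_linear((X_s - mu_s) @ basis, y_s)
pred_common = predict_linear((X_t - mu_t) @ basis, coef)

# Naive transfer: train on raw secondary features, apply to raw target.
coef_naive = fit_linear(X_s, y_s)
pred_naive = predict_linear(X_t, coef_naive)

print("naive RMSE:       ", rmse(y_t, pred_naive))
print("common-basis RMSE:", rmse(y_t, pred_common))
```

The two printed errors only describe this synthetic setup and say nothing about the real databases. The point of the sketch is the data flow: both databases are projected onto one shared set of basis vectors before the model trained on the secondary database is applied to the target.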
