Deep-Resp-Forest: A deep forest model to predict anti-cancer drug response.

The identification of therapeutic biomarkers predictive of drug response is crucial in personalized medicine. Following the establishment of several pharmacogenomics screening databases, a number of computational models have been developed to predict the response to anti-cancer drugs. In this study, we propose a deep cascaded forest model, Deep-Resp-Forest, to classify anti-cancer drug response as "sensitive" or "resistant". We make three contributions. First, diverse molecular data are integrated so that the classifier has more information than a single data type provides; combinations of two data types were tested here. Second, we propose two structures based on multi-grained scanning that transform the raw features into high-dimensional feature vectors and integrate the diverse data. Third, we improve the original deep and time-consuming cascade forest architecture with a feature optimization operation that emphasizes the most discriminative features across layers. We evaluated the proposed method on the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) data sets and compared it with the Support Vector Machine. The results demonstrate the promise of deep learning and deep forest approaches for drug response prediction tasks. The R implementation for running our experiments is available at https://github.com/RanSuLab/Deep-Resp-Forest.
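The multi-grained scanning step mentioned above can be illustrated with a minimal Python sketch. It is not the authors' R implementation and does not reproduce either of the two proposed structures: the function names, the window sizes, and the `toy_proba` stand-in for a trained forest are all hypothetical, and simple concatenation across data types is an assumption made only for illustration.

```python
from typing import Callable, List, Sequence


def sliding_windows(features: Sequence[float], window: int,
                    stride: int = 1) -> List[List[float]]:
    """Return every contiguous sub-vector of length `window`."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > len(features):
        raise ValueError("window is longer than the feature vector")
    last_start = len(features) - window
    return [list(features[i:i + window])
            for i in range(0, last_start + 1, stride)]


def scan_one_grain(features: Sequence[float], window: int,
                   predict_proba: Callable[[List[float]], List[float]],
                   stride: int = 1) -> List[float]:
    """Concatenate the class-probability vectors predicted for each window."""
    transformed: List[float] = []
    for sub_vector in sliding_windows(features, window, stride):
        transformed.extend(predict_proba(sub_vector))
    return transformed


def multi_grained_scan(views: Sequence[Sequence[float]],
                       windows: Sequence[int],
                       predict_proba: Callable[[List[float]], List[float]]
                       ) -> List[float]:
    """Scan each data type (view) at each window size and concatenate.

    Assumption: the views are integrated by plain concatenation.
    """
    combined: List[float] = []
    for view in views:
        for window in windows:
            combined.extend(scan_one_grain(view, window, predict_proba))
    return combined


def toy_proba(sub_vector: List[float]) -> List[float]:
    """Hypothetical stand-in for a trained forest's two-class output."""
    mean = sum(sub_vector) / len(sub_vector)
    p_sensitive = min(1.0, max(0.0, mean))  # clamp to [0, 1]
    return [p_sensitive, 1.0 - p_sensitive]


if __name__ == "__main__":
    expression = [0.1, 0.2, 0.3, 0.4]   # made-up values for one data type
    copy_number = [0.5, 0.6, 0.7]       # made-up values for a second type
    vector = multi_grained_scan([expression, copy_number], [2, 3], toy_proba)
    print(len(vector))
```

In a real pipeline, `predict_proba` would be a forest trained on the windowed sub-vectors, and the transformed vector would be passed to the cascade layers; the toy function only keeps the sketch self-contained.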
