Improving drug sensitivity predictions in precision medicine through active expert knowledge elicitation

Predicting the efficacy of a drug for a given individual, using high-dimensional genomic measurements, is at the core of precision medicine. However, identifying features on which to base the predictions remains a challenge, especially when the sample size is small. Incorporating expert knowledge offers a promising way to improve a prediction model, but collecting such knowledge is laborious for the expert if the number of candidate features is very large. We introduce a probabilistic model that can incorporate expert feedback about the impact of genomic measurements on the sensitivity of a cancer cell to a given drug. We also present two methods to intelligently collect this feedback from the expert, using experimental design and multi-armed bandit models. In a multiple myeloma blood cancer data set (n=51), expert knowledge decreased the prediction error by 8%. Furthermore, the intelligent approaches can reduce the workload of feedback collection to less than 30%, on average, of that required by a naive approach.
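To give a rough idea of how a multi-armed bandit can cut down the number of questions put to an expert, the following is a minimal sketch and not the method of the paper. It assumes, purely for illustration, that candidate features are organised into groups (for example, gene sets), treats each group as an arm, and uses the standard UCB1 rule to decide which group the next query should come from. The function name `ucb1_query_order`, the grouping, and the simulated expert are all hypothetical.

```python
import math
import random


def ucb1_query_order(groups, expert_says_relevant, budget, seed=0):
    """Pick features to show the expert with UCB1 over feature groups.

    groups: dict mapping a group name to a list of feature names
        (a hypothetical grouping, e.g. gene sets).
    expert_says_relevant: callable feature -> bool, a stand-in for the expert.
    budget: maximum number of queries the expert is willing to answer.
    Returns a list of (feature, answer) pairs in the order they were asked.
    """
    rng = random.Random(seed)
    # Shuffle each group once so that queries within a group are random.
    remaining = {g: rng.sample(feats, len(feats)) for g, feats in groups.items()}
    pulls = {g: 0 for g in groups}  # queries spent on each group
    wins = {g: 0 for g in groups}   # "relevant" answers received per group
    history = []

    for t in range(1, budget + 1):
        candidates = [g for g in groups if remaining[g]]
        if not candidates:
            break  # every feature has already been queried

        def score(g):
            if pulls[g] == 0:
                return float("inf")  # try every group at least once
            mean = wins[g] / pulls[g]
            return mean + math.sqrt(2.0 * math.log(t) / pulls[g])

        group = max(candidates, key=score)
        feature = remaining[group].pop()
        answer = bool(expert_says_relevant(feature))
        pulls[group] += 1
        wins[group] += int(answer)
        history.append((feature, answer))

    return history


if __name__ == "__main__":
    # Toy setting: group "A" holds only relevant features, group "B" none.
    toy_groups = {
        "A": [f"a{i}" for i in range(20)],
        "B": [f"b{i}" for i in range(20)],
    }
    asked = ucb1_query_order(toy_groups, lambda f: f.startswith("a"), budget=20)
    print(sum(answer for _, answer in asked), "relevant features found in", len(asked), "queries")
```

In this toy setting the query budget concentrates on the group that keeps yielding relevant features, which is the intuition behind spending expert effort adaptively rather than asking about features in arbitrary order.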
