Discussion of “Combining biomarkers to optimize patient treatment recommendation”

We congratulate Kang, Janes, and Huang (hereafter KJH) on an interesting and powerful new method for estimating an optimal treatment rule, also referred to as an optimal treatment regime. Their proposed method relies on having a high-quality estimator for the regression of outcome on biomarkers and treatment, which the authors obtain using a novel boosting algorithm. Methods for constructing treatment rules/regimes that rely on outcome models are sometimes called indirect or regression-based methods because the treatment rule is inferred from the outcome model (Barto and Dieterich, 1988). Regression-based methods are appealing because they can be used to make prognostic predictions as well as treatment recommendations. While it is common practice to use parametric or semiparametric models in regression-based approaches (Robins, 2004; Chakraborty and Moodie, 2013; Laber et al., 2014; Schulte et al., 2014), there is growing interest in using nonparametric methods to avoid model misspecification (Zhao et al., 2011; Moodie et al., 2013).

In contrast, direct estimation methods, also known as policy-search methods, weaken or eliminate dependence on a correctly specified outcome model by searching directly for the best treatment rule within a pre-specified class of rules (Orellana, Rotnitzky, and Robins, 2010; Zhang et al., 2012a,b; Zhao et al., 2012; Zhang et al., 2013). Because direct estimation methods make fewer assumptions about the outcome model, they may be more robust to model misspecification but potentially more variable.

We derive a direct estimation analog to the method of KJH, which we term value boosting. The method is based on recasting the problem of estimating an optimal treatment rule as a weighted classification problem (Zhang et al., 2012a; Zhao et al., 2012).
We show how the method of KJH can be used with existing policy-search methods to construct a treatment rule that is interpretable, logistically feasible, parsimonious, or otherwise appealing.
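To make the weighted-classification recasting concrete, the sketch below (our illustration, not the authors' implementation; `simulate_trial` and `fit_threshold_rule` are hypothetical names) estimates a single-biomarker threshold rule by maximizing an inverse-probability-weighted value estimate, in the spirit of Zhang et al. (2012a) and Zhao et al. (2012). Maximizing the weighted value over a class of rules is equivalent to minimizing a weighted misclassification error with labels A_i and weights Y_i/pi(A_i|X_i):

```python
import random

def simulate_trial(n, seed=0):
    """Simulated randomized trial: biomarker x ~ Uniform(0, 1), treatment
    a in {-1, +1} assigned with probability 1/2, and outcome y larger when
    the treatment matches the (true, unknown) optimal rule sign(x - 0.5)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 1)
        a = rng.choice([-1, 1])
        opt = 1 if x > 0.5 else -1
        y = 1.0 + (1.0 if a == opt else 0.0) + rng.gauss(0, 0.1)
        data.append((x, a, y))
    return data

def fit_threshold_rule(data, propensity=0.5):
    """Policy search over the class of threshold rules d(x) = sign(x - c):
    choose c maximizing the inverse-probability-weighted value estimate
        sum_i Y_i * 1{A_i = d(X_i)} / pi(A_i | X_i),
    i.e., minimizing weighted misclassification of the received treatments."""
    cutoffs = sorted(x for x, _, _ in data)
    best_c, best_value = None, float("-inf")
    for c in cutoffs:
        value = sum(y / propensity for x, a, y in data
                    if a == (1 if x > c else -1))
        if value > best_value:
            best_c, best_value = c, value
    return best_c

data = simulate_trial(500)
c_hat = fit_threshold_rule(data)
print(round(c_hat, 3))  # estimated cutoff; should lie near the true 0.5
```

In this toy setup the class of rules is deliberately tiny (a single cutoff), which mirrors the appeal of policy search noted above: the class can be restricted to rules that are interpretable or logistically feasible without modeling the full outcome regression.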

[1] M. J. van der Laan, et al. Optimal Dynamic Treatments in Resource-Limited Settings, 2015.

[2] Corinna Cortes, et al. Support-Vector Networks. Machine Learning, 1995.

[3] M. Kosorok, et al. Reinforcement Learning Design for Cancer Clinical Trials. Statistics in Medicine, 2009.

[4] Nema Dean, et al. Q-Learning: Flexible Learning About Useful Utilities. Statistics in Biosciences, 2013.

[5] Y. Freund, et al. Discussion of the Paper "Additive Logistic Regression: A Statistical View of Boosting", 2000.

[6] Donglin Zeng, et al. Estimating Individualized Treatment Rules Using Outcome Weighted Learning. Journal of the American Statistical Association, 2012.

[7] M. Kosorok, et al. Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer. Biometrics, 2011.

[8] Anastasios A. Tsiatis, et al. Q- and A-Learning Methods for Estimating Optimal Dynamic Treatment Regimes. Statistical Science, 2012.

[9] Eric B. Laber, et al. A Robust Method for Estimating Optimal Treatment Regimes. Biometrics, 2012.

[10] J. Friedman. Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 2001.

[11] Mark Culp, et al. ada: An R Package for Stochastic Boosting, 2006.

[12] S. Murphy, et al. Optimal Dynamic Treatment Regimes, 2003.

[13] Kurt Hornik, et al. Misc Functions of the Department of Statistics (e1071), TU Wien, 2014.

[14] H. Zou, et al. New Multicategory Boosting Algorithms Based on Multicategory Fisher-Consistent Losses. The Annals of Applied Statistics, 2008.

[15] Peter Bühlmann, et al. Boosting Algorithms: Regularization, Prediction and Model Fitting, 2007, arXiv:0804.2752.

[16] Eric B. Laber, et al. Interactive Model Building for Q-Learning. Biometrika, 2014.

[17] Yoav Freund, et al. Boosting: Foundations and Algorithms, 2012.

[18] Min Zhang, et al. Estimating Optimal Treatment Regimes from a Classification Perspective. Stat, 2012.

[19] Yoav Freund, et al. An Adaptive Version of the Boost by Majority Algorithm. COLT '99, 1999.

[20] Robert Tibshirani, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition. Springer Series in Statistics, 2001.

[21] J. Robins, et al. Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models, 1999.

[22] S. Murphy, et al. Performance Guarantees for Individualized Treatment Rules. Annals of Statistics, 2011.

[23] Salvatore J. Stolfo, et al. AdaCost: Misclassification Cost-Sensitive Boosting. ICML, 1999.

[24] Marie Davidian, et al. Robust Estimation of Optimal Dynamic Treatment Regimes for Sequential Treatment Decisions. Biometrika, 2013.

[25] B. Chakraborty, et al. Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine, 2013.

[26] James M. Robins. Optimal Structural Nested Models for Optimal Sequential Decisions, 2004.

[27] Yoav Freund, et al. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. EuroCOLT, 1995.

[28] Tong Zhang. Statistical Behavior and Consistency of Classification Methods Based on Convex Risk Minimization, 2003.

[29] J. Robins, et al. Dynamic Regime Marginal Structural Mean Models for Estimation of Optimal Dynamic Treatment Regimes, Part I: Main Content. The International Journal of Biostatistics, 2011.