Variable selection for optimal treatment decision

In choosing optimal treatment strategies, it is important to identify the variables that enter the decision rule, i.e., those that interact with the treatment. Effective variable selection improves prediction accuracy and makes the decision rule easier to interpret. We propose a new penalized regression framework that simultaneously estimates the optimal treatment strategy and identifies the important variables. The new approach has two advantages: (i) it does not require estimating the baseline mean function of the response, which greatly improves the robustness of the estimator; and (ii) its convenient loss-based framework makes it easy to adopt shrinkage methods for variable selection, which greatly facilitates implementation and statistical inference for the estimator. The new procedure can be easily implemented with existing state-of-the-art software packages such as LARS. We study the theoretical properties of the new estimator, evaluate its empirical performance in simulation studies, and illustrate it with an application to an AIDS clinical trial.
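The abstract does not state the loss function, so the following is only a minimal sketch of what a baseline-free penalized fit could look like. It assumes a randomized trial with a known treatment probability of 0.5, a squared-error loss on the centered response, and a design built from (A - 0.5) times the covariates, so that the deliberately unmodeled baseline is uncorrelated with the interaction terms. A plain coordinate-descent lasso is swapped in for LARS, and every name, setting, and simulated value here is hypothetical rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 10

# Simulated randomized trial (all settings are illustrative assumptions).
X = rng.normal(size=(n, p))
A = rng.binomial(1, 0.5, size=n)                       # treatment, known propensity 0.5
baseline = 1.0 + 2.0 * np.sin(X[:, 0]) + X[:, 1] ** 2  # nonlinear, never modeled below
contrast = 0.5 + 2.0 * X[:, 0] - 2.0 * X[:, 1]         # true treatment-covariate interaction
Y = baseline + A * contrast + rng.normal(size=n)

# Interaction design: (A - propensity) * (1, X). Centering A by its
# propensity makes these columns uncorrelated with any function of X alone.
Z = (A - 0.5)[:, None] * np.column_stack([np.ones(n), X])
y = Y - Y.mean()  # crude constant working model for the baseline


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_cd(Z, y, lam, penalized, n_sweeps=200):
    """Minimize (1/2n)||y - Zb||^2 + lam * sum_j penalized[j] * |b_j|."""
    n_obs, n_cols = Z.shape
    b = np.zeros(n_cols)
    col_scale = (Z ** 2).sum(axis=0) / n_obs
    resid = y.copy()
    for _ in range(n_sweeps):
        for j in range(n_cols):
            # Correlation of column j with the partial residual.
            rho = Z[:, j] @ resid / n_obs + col_scale[j] * b[j]
            t = lam if penalized[j] else 0.0
            new_bj = soft_threshold(rho, t) / col_scale[j]
            resid -= Z[:, j] * (new_bj - b[j])
            b[j] = new_bj
    return b


# The treatment main effect (first column) is left unpenalized.
penalized = np.array([False] + [True] * p)
beta_hat = lasso_cd(Z, y, lam=0.1, penalized=penalized)

selected = np.flatnonzero(beta_hat[1:])           # covariates kept in the rule
rule_hat = (beta_hat[0] + X @ beta_hat[1:]) > 0   # estimated rule: treat if positive
agreement = np.mean(rule_hat == (contrast > 0))   # match with the true optimal rule

print("selected covariates:", selected)
print("agreement with true rule:", round(float(agreement), 3))
```

In this sketch the decision rule depends only on the sign of the fitted interaction, so lasso shrinkage of the coefficients matters less than which covariates survive selection. An adaptive or nonconcave penalty could replace the plain lasso penalty without changing the structure of the fit.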
