Inference for non-regular parameters in optimal dynamic treatment regimes

A dynamic treatment regime is a set of decision rules, one per stage, each taking a patient’s treatment and covariate history as input and outputting a recommended treatment. When the optimal dynamic treatment regime is estimated from longitudinal data, the treatment effect parameters at any stage prior to the last can be non-regular under certain distributions of the data. This non-regularity results in biased estimates and invalid confidence intervals for the treatment effect parameters. In this article, we discuss both the problem of non-regularity and the available estimation methods. We provide an extensive simulation study comparing the estimators in terms of their ability to yield valid confidence intervals under a variety of non-regular scenarios. An analysis of a data set from a smoking cessation trial is provided as an illustration.
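The bias mentioned above can be illustrated with a toy calculation. The sketch below is not the estimation procedure studied in this article. It assumes a stylized one-parameter setting in which the quantity of interest is max(mu, 0), a non-smooth function of a mean mu, and the estimator is the plug-in max(xbar, 0). This mimics the maximization over later-stage treatments that is commonly given as the source of non-regularity. The function name `plug_in_bias` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def plug_in_bias(mu, n=100, sigma=1.0, reps=20000, seed=0):
    """Monte Carlo bias of max(xbar, 0) as an estimator of max(mu, 0).

    Toy setting (an assumption, not the article's model): xbar is the mean
    of n draws from N(mu, sigma^2), so xbar ~ N(mu, sigma^2 / n).
    """
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    total = 0.0
    for _ in range(reps):
        xbar = rng.gauss(mu, se)   # draw from the sampling distribution of the mean
        total += max(xbar, 0.0)    # plug-in estimate of max(mu, 0)
    return total / reps - max(mu, 0.0)


# mu = 0 sits at the kink of max(., 0): the non-regular case.
bias_nonregular = plug_in_bias(mu=0.0)
# mu = 1 is far from the kink relative to the standard error: a regular case.
bias_regular = plug_in_bias(mu=1.0)

print(f"bias at mu = 0: {bias_nonregular:.4f}")
print(f"bias at mu = 1: {bias_regular:.4f}")
```

In this toy setting the plug-in estimator is biased upward at mu = 0, by roughly sigma / sqrt(2 * pi * n) in theory, while the bias is negligible at mu = 1. The same qualitative behaviour is what makes intervals built on standard asymptotics unreliable near the non-regular point.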
