Statistical controversies in clinical research: scientific and ethical problems with adaptive randomization in comparative clinical trials.

BACKGROUND In recent years, various outcome adaptive randomization (AR) methods have been used to conduct comparative clinical trials. Rather than randomizing patients equally between treatments, outcome AR uses the accumulating data to unbalance the randomization probabilities in favor of the treatment arm that is currently empirically superior. This is motivated by the idea that, on average, more patients in the trial will be given the treatment that is truly superior, so AR is ethically more desirable than equal randomization. AR remains controversial, however, and some of its properties are not well understood by the clinical trials community.

MATERIALS AND METHODS Computer simulation was used to evaluate the properties of a 200-patient clinical trial conducted using one of four Bayesian AR methods and to compare them with those of an equally randomized group sequential design.

RESULTS Outcome AR has several undesirable properties. One, which might be surprising to nonstatisticians, is a high probability of a sample size imbalance in the wrong direction, wherein many more patients are assigned to the inferior treatment arm, the opposite of the intended effect. Compared with an equally randomized design, outcome AR produces less reliable final inferences, including a greatly overestimated treatment effect difference and lower power to detect a treatment difference. This estimation bias becomes much larger if the prognosis of the accrued patients either improves or worsens systematically during the trial.

CONCLUSIONS AR produces inferential problems that decrease potential benefit to future patients, and may decrease benefit to patients enrolled in the trial. These problems should be weighed against its putative ethical benefit. For randomized comparative trials intended to obtain confirmatory comparisons, designs with fixed randomization probabilities and group sequential decision rules appear to be preferable to AR, both scientifically and ethically.
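The kind of simulation described above can be illustrated with a minimal Python sketch. It is not the authors' code and does not reproduce any of their four AR methods. It assumes a two-arm trial with binary responses, independent Beta(1, 1) priors, and a randomization probability for arm B proportional to P(p_B > p_A | data)^c with c = 1/2. The response rates (0.25 and 0.35), the imbalance margin of 20 patients, and all function names are hypothetical choices made for illustration only.

```python
import random


def prob_b_beats_a(a_succ, a_fail, b_succ, b_fail, rng, draws=200):
    """Monte Carlo estimate of P(p_B > p_A | data) under Beta(1, 1) priors (assumed)."""
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(1 + a_succ, 1 + a_fail)
        p_b = rng.betavariate(1 + b_succ, 1 + b_fail)
        wins += p_b > p_a
    return wins / draws


def simulate_trial(p_a, p_b, n_patients=200, power_c=0.5, rng=None):
    """Run one outcome-adaptive trial; return the patient counts (n_a, n_b)."""
    rng = rng or random.Random()
    succ = {"A": 0, "B": 0}
    fail = {"A": 0, "B": 0}
    for _ in range(n_patients):
        post = prob_b_beats_a(succ["A"], fail["A"], succ["B"], fail["B"], rng)
        # Tempered allocation rule (assumed): weights post^c and (1 - post)^c.
        w_b = post ** power_c
        w_a = (1.0 - post) ** power_c
        arm = "B" if rng.random() < w_b / (w_a + w_b) else "A"
        true_p = p_b if arm == "B" else p_a
        if rng.random() < true_p:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return succ["A"] + fail["A"], succ["B"] + fail["B"]


def wrong_direction_rate(p_a, p_b, n_sims=100, margin=20, n_patients=200, seed=0):
    """Fraction of simulated trials in which the inferior arm A (p_a < p_b)
    receives at least `margin` more patients than the superior arm B."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_sims):
        n_a, n_b = simulate_trial(p_a, p_b, n_patients=n_patients, rng=rng)
        wrong += (n_a - n_b) >= margin
    return wrong / n_sims


if __name__ == "__main__":
    # Hypothetical scenario: arm B is truly better (0.35 vs 0.25).
    print(wrong_direction_rate(0.25, 0.35, n_sims=50))
```

The printed fraction estimates how often this particular rule produces a sample size imbalance toward the inferior arm. Its value depends on the assumed response rates, tempering exponent, and margin, so it should not be read as a reproduction of the reported results.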
