Asymptotically Optimal Response‐Adaptive Designs for Allocating the Best Treatment: An Overview

Summary  Response-adaptive designs are increasingly used in applications, especially in early-phase clinical trials. This paper reviews a particular class of response-adaptive designs that select the superior treatment with probability tending to one, a desirable property from an ethical point of view in clinical trials. The model underlying such designs is a randomly reinforced urn. The paper gives an overview of results for these designs, from the early paper of Durham and Yu (1990) to the recent work of Flournoy, May, Moler and Plo (2010).
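As an illustration only, the following minimal Python sketch simulates a two-color randomly reinforced urn of the kind the summary refers to. It is not taken from the paper: the function names (`simulate_rru`, `bernoulli`), the Bernoulli response model, the success probabilities 0.8 and 0.3, the initial urn composition and the number of draws are all assumptions chosen for the example. Each patient is assigned by drawing a color with probability equal to its current share of the urn, and the urn is then reinforced in that color by the observed response.

```python
import random


def simulate_rru(n_draws, reinforce_a, reinforce_b, init_a=1.0, init_b=1.0, seed=0):
    """Simulate a two-color randomly reinforced urn (illustrative sketch).

    reinforce_a / reinforce_b are hypothetical callables that take a
    random.Random instance and return a non-negative response, used as
    the amount of mass added to the drawn color.
    Returns (final urn proportion of color A, fraction of draws given to A).
    """
    rng = random.Random(seed)
    a, b = init_a, init_b      # current urn mass for treatments A and B
    n_a = 0                    # number of patients allocated to A
    for _ in range(n_draws):
        # Draw a color with probability proportional to its urn mass.
        if rng.random() < a / (a + b):
            a += reinforce_a(rng)   # reinforce A by its observed response
            n_a += 1
        else:
            b += reinforce_b(rng)   # reinforce B by its observed response
    return a / (a + b), n_a / n_draws


def bernoulli(p):
    """Assumed response model: 1.0 on success (probability p), else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


# Assumed scenario: treatment A succeeds with probability 0.8, B with 0.3.
urn_prop, alloc_prop = simulate_rru(5000, bernoulli(0.8), bernoulli(0.3))
print(f"urn proportion of A: {urn_prop:.3f}, allocation to A: {alloc_prop:.3f}")
```

Under these assumptions the treatment with the larger mean response should come to dominate both the urn and the allocations as the number of draws grows, which is the behavior the summary describes; a single simulated run is only suggestive and does not replace the asymptotic results reviewed in the paper.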

[1]  Svante Janson, et al.  Functional limit theorems for multitype branching processes and generalized Pólya urns, 2004.

[2]  Giacomo Aletti, et al.  On the distribution of the limit proportion for a two-color, randomly reinforced urn with equal reinforcement distributions, 2007, Advances in Applied Probability.

[3]  S. D. Durham, et al.  A sequential design for maximizing the probability of a favourable response, 1998.

[4]  Luigi Salmaso, et al.  Permutation Tests for Complex Data, 2010.

[5]  Uttam Bandyopadhyay, et al.  Adaptive designs for normal responses with prognostic factors, 2001.

[6]  William F. Rosenberger, et al.  RANDOMIZED URN MODELS AND SEQUENTIAL DESIGN, 2002.

[7]  Nancy Flournoy.  RESPONSE-DRIVEN URN DESIGNS: COMMENT ON “RANDOMIZED URN MODELS AND SEQUENTIAL DESIGN” BY W. F. ROSENBERGER, 2002.

[8]  P. Secchi, et al.  A central limit theorem, and related results, for a two-color randomly reinforced urn, 2008, Advances in Applied Probability.

[9]  Anastasia Ivanova, et al.  A play-the-winner-type urn design with reduced variability, 2003.

[10]  P. Thall, et al.  Practical Bayesian adaptive randomisation in clinical trials, 2007, European journal of cancer.

[11]  William F. Rosenberger, et al.  Optimality, Variability, Power, 2003.

[12]  W. Rosenberger, et al.  The theory of response-adaptive randomization in clinical trials, 2006.

[13]  Martin Posch, et al.  Attainability of boundary points under reinforcement learning, 2005, Games Econ. Behav.

[14]  Nancy Flournoy, et al.  On Testing Hypotheses in Response-Adaptive Designs Targeting the Best Treatment, 2010.

[15]  M. Zelen, et al.  The randomization and stratification of patients to clinical trials, 1974, Journal of chronic diseases.

[16]  Andrew W. Moore, et al.  Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[17]  Vincent F. Melfi, et al.  Independence after adaptive allocation, 2008.

[18]  Anna Maria Paganoni, et al.  A numerical study for comparing two response-adaptive designs for continuous treatment effects, 2007, Stat. Methods Appl.

[19]  Feifang Hu, et al.  Asymptotics in randomized urn models, 2005.

[20]  N. Flournoy, et al.  Asymptotics in response-adaptive designs generated by a two-color, randomly reinforced urn, 2009, 0904.0350.

[21]  Feifang Hu, et al.  Asymptotic theorems of sequential estimation-adjusted urn models, 2006.

[22]  William F. Rosenberger, et al.  Response-Adaptive Randomization for Clinical Trials with Continuous Outcomes, 2006, Biometrics.

[23]  Donald A. Berry, et al.  Bayesian nonparametric bandits, 1985.

[24]  S. D. Durham, et al.  Randomized Play-the-Leader Rules for Sequential Sampling from Two Populations, 1990, Probability in the engineering and informational sciences.

[25]  Manas K. Chattopadhyay.  Two-armed Dirichlet bandits with discounting, 1994.

[26]  Alan W. Beggs, et al.  On the convergence of reinforcement learning, 2005, J. Econ. Theory.

[27]  Anna Maria Paganoni, et al.  A randomly reinforced urn, 2006.