A modified bandit as an approach to ethical allocation in clinical trials

A sequential allocation rule based on an optimal strategy for a two-armed bandit problem is proposed for identifying the better of two treatments in a clinical trial. The rule is intended to allocate patients more ethically while retaining a given probability of correctly selecting the better treatment at the close of the trial. Its behavior is compared with that of two other commonly proposed allocation rules: play-the-winner and vector-at-a-time. In general, the bandit rule performs as well as, and usually remarkably better than, both of the other rules. All comparisons are based on exact computations using forward induction algorithms carried out on desktop workstations.
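To illustrate what an exact comparison of allocation rules can look like, the following is a minimal sketch that pushes probability mass forward one patient at a time for the two comparison rules. It is not the bandit rule or the algorithm used in this work. It assumes Bernoulli outcomes, a fixed number of patients with no stopping rule, a play-the-winner rule that starts on arm 0, and a vector-at-a-time rule that alternates arms within each pair. The names `exact_fixed_horizon`, `play_the_winner`, and `vector_at_a_time` are hypothetical.

```python
def play_the_winner(state, t):
    """Stay on the same arm after a success, switch after a failure.

    Assumption: the first patient is assigned to arm 0.
    """
    if state is None:
        return 0
    arm, success = state
    return arm if success else 1 - arm


def vector_at_a_time(state, t):
    """Allocate in pairs, one patient per arm (alternating within a pair)."""
    return t % 2


def exact_fixed_horizon(p, n, rule):
    """Exact expected allocations and successes after n patients.

    p    -- (p0, p1), success probabilities of the two arms
    n    -- number of patients (fixed horizon; an assumption of this sketch)
    rule -- function (state, t) -> arm, where state is None before the
            first patient and (last_arm, last_success) afterwards
    """
    mass = {None: 1.0}      # probability of each state before patient t
    expected_alloc = [0.0, 0.0]
    expected_successes = 0.0
    for t in range(n):
        new_mass = {}
        for state, prob in mass.items():
            arm = rule(state, t)
            expected_alloc[arm] += prob
            for success, q in ((True, p[arm]), (False, 1.0 - p[arm])):
                key = (arm, success)
                new_mass[key] = new_mass.get(key, 0.0) + prob * q
                if success:
                    expected_successes += prob * q
        mass = new_mass
    return expected_alloc, expected_successes


if __name__ == "__main__":
    for rule in (play_the_winner, vector_at_a_time):
        alloc, successes = exact_fixed_horizon((0.8, 0.4), 20, rule)
        print(rule.__name__, alloc, successes)
```

Because both rules depend only on the most recent allocation and outcome, the state space here stays tiny. A rule that depends on the full success and failure counts of each arm, as a bandit strategy generally does, would need a larger state, but the same forward pass over states would apply.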