The Sequential Design of Bernoulli Experiments Including Switching Costs

We consider a sequence of N trials, each of which must be performed on one of two given Bernoulli experiments. We assume that the success probability of one experiment is known and that of the other is unknown. Performing two successive trials on different experiments incurs a switching cost. The problem is to choose an experiment for each trial so as to maximize the expected number of successes minus the expected switching costs. We show that an optimal design has some well-known monotonicity properties, such as the “stopping rule” and the “stay-on-a-winner” rule. We also show how to use these results to derive a simple algorithm for calculating the optimal design. Since the model contains the one-armed bandit problem as a special case, we also obtain new proofs of known results.
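To make the optimization problem concrete, the following is a minimal sketch of plain backward induction for a Bayesian version of the model. It is not the simple algorithm derived in the paper. Several ingredients are assumptions not stated in the abstract: a Beta(a, b) prior on the unknown success probability, no switching cost charged on the first trial, and the hypothetical names `optimal_value`, `lam` (the known success probability), and `c` (the switching cost).

```python
from functools import lru_cache


def optimal_value(N, lam, c, a=1.0, b=1.0):
    """Expected successes minus switching costs under an optimal design.

    Assumptions (not taken from the abstract): the unknown success
    probability has a Beta(a, b) prior, and the first trial incurs no
    switching cost.
    """
    KNOWN, UNKNOWN = 0, 1

    @lru_cache(maxsize=None)
    def value(n, s, f, cur):
        # n: trials remaining; s, f: successes and failures observed so far
        # on the unknown experiment; cur: experiment used on the previous
        # trial (None before the first trial).
        if n == 0:
            return 0.0
        # Posterior mean of the unknown success probability.
        p = (a + s) / (a + b + s + f)
        # Use the known experiment: pay c if the previous trial was unknown.
        known = lam + value(n - 1, s, f, KNOWN)
        if cur == UNKNOWN:
            known -= c
        # Use the unknown experiment: pay c if the previous trial was known.
        unknown = (p * (1.0 + value(n - 1, s + 1, f, UNKNOWN))
                   + (1.0 - p) * value(n - 1, s, f + 1, UNKNOWN))
        if cur == KNOWN:
            unknown -= c
        return max(known, unknown)

    return value(N, 0, 0, None)


if __name__ == "__main__":
    # Two trials, known probability 0.5, uniform prior, no switching cost.
    print(optimal_value(2, 0.5, 0.0))
```

Setting `c = 0` in this sketch gives the one-armed bandit problem mentioned above as a special case. The number of states grows quadratically in N, so the recursion is only practical for moderate horizons.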