Test & Roll: Profit-Maximizing A/B Tests

Marketers often use A/B testing as a tool to compare marketing treatments in a test stage and then deploy the better-performing treatment to the remainder of the consumer population. Whereas these tests have traditionally been analyzed using hypothesis testing, we reframe them as an explicit trade-off between the opportunity cost of the test (where some customers receive a suboptimal treatment) and the potential losses associated with deploying a suboptimal treatment to the remainder of the population. We derive a closed-form expression for the profit-maximizing test size and show that it is substantially smaller than typically recommended for a hypothesis test, particularly when the response is noisy or when the total population is small. The common practice of using small holdout groups can be rationalized by asymmetric priors. The proposed test design achieves nearly the same expected regret as the flexible yet harder-to-implement multi-armed bandit under a wide range of conditions. We demonstrate the benefits of the method in three different marketing contexts—website design, display advertising, and catalog tests—in which we estimate priors from past data. In all three cases, the optimal sample sizes are substantially smaller than for a traditional hypothesis test, resulting in higher profit.
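
As a rough illustration of the size gap described above, the sketch below compares a profit-maximizing test size with a conventional hypothesis-test sample size. The abstract does not state the closed-form expression, so the formula here is an assumption: it is the form that applies when both treatments share a symmetric normal prior with standard deviation sigma and a normally distributed response with standard deviation s. The function names and all numeric inputs (population N, mean response 0.68, sigma, minimum detectable difference d) are hypothetical values chosen for illustration, not results reported here.

```python
import math
from statistics import NormalDist


def profit_max_test_size(N, s, sigma):
    """Per-arm test size under an ASSUMED symmetric normal prior.

    N     : total population available (test plus rollout)
    s     : standard deviation of an individual response
    sigma : prior standard deviation of each treatment's mean response
    """
    ratio = (s / sigma) ** 2
    return math.sqrt(N / 4 * ratio + (0.75 * ratio) ** 2) - 0.75 * ratio


def hypothesis_test_size(s, d, alpha=0.05, power=0.80):
    """Standard per-arm sample size for a two-sided, two-sample z-test
    that detects a difference in means of d."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return 2 * (z * s / d) ** 2


# Hypothetical inputs: a binary response with mean 0.68, so s = sqrt(p * (1 - p)).
N = 100_000
s = math.sqrt(0.68 * 0.32)
sigma = 0.03   # assumed prior spread of treatment means
d = 0.0136     # assumed minimum detectable difference (2% of 0.68)

n_star = profit_max_test_size(N, s, sigma)
n_hyp = hypothesis_test_size(s, d)

print(f"profit-maximizing size per arm: {n_star:.0f}")
print(f"hypothesis-test size per arm:   {n_hyp:.0f}")
```

Under this assumed form, the test size grows roughly with the square root of N when N is large, whereas the hypothesis-test size does not depend on N at all. That is one way to see why the two recommendations diverge most for small populations and noisy responses.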