The Multi-Armed Bandit Problem under Delayed Rewards Conditions in Digital Campaign Management

In this paper, we study a digital marketing content recommendation system, called campaign management, which marketers use to create digital content that can be issued or configured for viewing by specific population segments according to a set of business variables, user profiles, or user behavior. We analyze the most representative allocation strategies for the multi-armed bandit problem in a setting with delayed rewards, by means of a numerical study based on a discrete event simulation. We consider both batch-mode and online-update architectures for processing the feedback from the different contents displayed to users.
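The setting described above can be illustrated with a minimal sketch. It is not the simulator used in the study: the epsilon-greedy strategy, the fixed delay, the Bernoulli reward model, and all names (`simulate`, `true_rates`, `delay`, `batch_size`) are assumptions chosen only to show how delayed feedback and the two update architectures interact.

```python
import heapq
import random


def simulate(true_rates, horizon=5000, delay=50, batch_size=None,
             epsilon=0.1, seed=0):
    """Epsilon-greedy bandit whose rewards arrive `delay` steps after display.

    batch_size=None applies each reward as soon as it arrives (online update);
    batch_size=k buffers arrived rewards and applies them every k steps.
    Returns (total_reward, pulls_per_arm). All parameters are hypothetical.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms       # how often each content was displayed
    observed = [0] * n_arms    # rewards already fed back to the policy
    successes = [0] * n_arms   # positive rewards already fed back
    pending = []               # min-heap of (arrival_time, arm, reward)
    buffer = []                # arrived but not yet applied feedback
    total = 0

    def estimate(a):
        # Optimistic value for arms with no feedback yet (an assumption).
        return successes[a] / observed[a] if observed[a] else 1.0

    for t in range(horizon):
        # Deliver every reward whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = heapq.heappop(pending)
            buffer.append((arm, reward))
        # Apply feedback immediately (online) or only at batch boundaries.
        if batch_size is None or t % batch_size == 0:
            for arm, reward in buffer:
                observed[arm] += 1
                successes[arm] += reward
            buffer.clear()
        # Epsilon-greedy choice using only the feedback applied so far.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=estimate)
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        total += reward
        heapq.heappush(pending, (t + delay, arm, reward))
    return total, pulls
```

Calling `simulate([0.02, 0.05, 0.10])` runs the online-update variant, and `simulate([0.02, 0.05, 0.10], batch_size=500)` runs the batch variant on the same random seed, so the two architectures can be compared under identical conditions. The heap plays the role of the event queue in a discrete event simulation; a random delay distribution could replace the fixed `delay` without changing the structure.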