Managing advertising campaigns — an approximate planning approach

We consider the problem of displaying commercial advertisements on web pages in the “cost per click” model. The advertisement server has to learn how appealing each advertisement is to each type of visitor in order to maximize profit. Advertisements come with constraints, such as a required number of clicks to draw and a limited lifetime. The problem is thus inherently dynamic, and it tightly couples combinatorial and statistical issues. It is also worth noting that the events of interest are very rare, since the base probability of a click is on the order of 10⁻⁴. Possible approaches range from computationally demanding ones (Markov decision processes or stochastic programming) to very fast ones. We introduce NOSEED, an adaptive policy learning algorithm based on a combination of linear programming and multi-armed bandits. We also propose a way to evaluate the extent to which the constraints have to be handled, which is directly related to the computational cost. We investigate the performance of our system through simulations on a realistic model designed with a major commercial web actor.
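To make the setting concrete, here is a minimal sketch of a bandit-style ad selector under click budgets. It is not NOSEED: it omits the linear programming step, ignores visitor types and advertisement lifetimes, and uses a plain UCB1-style index as a stand-in. All names (`ucb_index`, `choose_ad`, `simulate`) and the per-click `price` are assumptions made for illustration only.

```python
import math
import random


def ucb_index(clicks, displays, total_displays):
    """UCB1-style optimistic estimate of a click probability (illustrative)."""
    if displays == 0:
        return float("inf")  # force at least one display of every ad
    bonus = math.sqrt(2.0 * math.log(total_displays) / displays)
    return clicks / displays + bonus


def choose_ad(stats, remaining_clicks, price, t):
    """Pick the ad with the highest optimistic expected revenue.

    stats[ad] is a (clicks, displays) pair; ads whose click budget is
    exhausted are skipped. Returns None when no ad is left.
    """
    best, best_score = None, -1.0
    for ad, (clicks, displays) in stats.items():
        if remaining_clicks[ad] <= 0:
            continue
        score = price[ad] * ucb_index(clicks, displays, t)
        if best is None or score > best_score:
            best, best_score = ad, score
    return best


def simulate(true_ctr, price, budget, horizon, seed=0):
    """Simulate `horizon` page views against hypothetical click probabilities."""
    rng = random.Random(seed)
    stats = {ad: (0, 0) for ad in true_ctr}
    remaining = dict(budget)
    profit = 0.0
    for t in range(1, horizon + 1):
        ad = choose_ad(stats, remaining, price, t)
        if ad is None:  # every click budget has been met
            break
        clicked = rng.random() < true_ctr[ad]
        clicks, displays = stats[ad]
        stats[ad] = (clicks + int(clicked), displays + 1)
        if clicked:
            remaining[ad] -= 1
            profit += price[ad]
    return profit, stats, remaining
```

With click probabilities near 10⁻⁴, the exploration bonus dominates the empirical rate for a very long time, which hints at why a plain bandit index alone is a poor fit here and why the budget and lifetime constraints call for an additional planning component.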
