Online Auctions and Multi-scale Online Learning

We consider revenue maximization in online auctions and pricing. A seller sells an identical item in each period to a new buyer, or a new set of buyers. For the online posted pricing problem, we show regret bounds that scale with the best fixed price, rather than with the range of the values. We also show regret bounds that are almost scale-free, and that match the offline sample complexity, when comparing to a benchmark that requires a lower bound on the market share. These results are obtained by generalizing the classical learning from experts and multi-armed bandit problems to their multi-scale versions. In the multi-scale versions, the reward of each action lies in a different range, and the regret with respect to a given action scales with that action's own range, rather than with the maximum range.
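As a rough illustration of the benchmark in the posted pricing problem, the following sketch computes the revenue of a sequence of posted prices, the best fixed price in hindsight, and the resulting regret. It is not the algorithm of the paper; the function names, the finite grid of candidate prices, and the sample values are assumptions made only for this example.

```python
from typing import Sequence, Tuple


def revenue(price: float, value: float) -> float:
    """Posted-price revenue: the buyer pays `price` only if value >= price."""
    return price if value >= price else 0.0


def best_fixed_price(values: Sequence[float],
                     candidates: Sequence[float]) -> Tuple[float, float]:
    """Return (price, total revenue) of the best fixed candidate in hindsight."""
    totals = {p: sum(revenue(p, v) for v in values) for p in candidates}
    best = max(totals, key=totals.get)
    return best, totals[best]


def regret(posted: Sequence[float],
           values: Sequence[float],
           candidates: Sequence[float]) -> float:
    """Revenue of the best fixed price minus the revenue actually earned."""
    earned = sum(revenue(p, v) for p, v in zip(posted, values))
    _, benchmark = best_fixed_price(values, candidates)
    return benchmark - earned


# Hypothetical data: one buyer value per period and a small price grid.
values = [1.0, 1.0, 8.0, 1.0]
candidates = [1.0, 2.0, 4.0, 8.0]
posted = [2.0, 2.0, 2.0, 2.0]  # the seller posts 2.0 in every period

print(best_fixed_price(values, candidates))
print(regret(posted, values, candidates))
```

In the multi-scale view, each candidate price acts as an action whose per-period reward lies in the range from 0 to that price, which is why a bound tied to each action's own range can be tighter than one tied to the largest price in the grid.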
