Regret Minimization for Reserve Prices in Second-Price Auctions

We present a regret minimization algorithm for setting the reserve price in a sequence of second-price auctions, under the assumption that all bids are drawn independently from the same unknown, arbitrary distribution. The algorithm is computationally efficient and achieves regret O(√T) over a sequence of T auctions. The bound holds even when the number of bidders is stochastic with a known distribution.
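
To make the setting concrete, the following is a minimal sketch of the quantities involved; it is not the algorithm from the paper. It assumes the standard second-price rule with a reserve (the item sells only if the highest bid meets the reserve, at the larger of the second-highest bid and the reserve). The names `revenue` and `empirical_regret`, and the use of a finite grid of candidate reserves, are illustrative assumptions. The regret computed here is measured against the best fixed reserve on that grid for the observed bids, which is a simplification of the expected regret the abstract refers to.

```python
def revenue(bids, reserve):
    """Seller revenue in one second-price auction with a reserve price.

    Assumed rule: the item sells only if the highest bid is at least the
    reserve; the winner then pays max(second-highest bid, reserve).
    """
    ranked = sorted(bids, reverse=True)
    if not ranked or ranked[0] < reserve:
        return 0.0  # no bid meets the reserve, so the item goes unsold
    second = ranked[1] if len(ranked) > 1 else 0.0
    return max(second, reserve)


def empirical_regret(auctions, chosen_reserves, candidates):
    """Revenue gap to the best fixed reserve among `candidates` (hypothetical grid).

    `auctions` is a list of bid lists, one per round; `chosen_reserves`
    holds the reserve actually used in each round.
    """
    earned = sum(revenue(b, p) for b, p in zip(auctions, chosen_reserves))
    best_fixed = max(sum(revenue(b, p) for b in auctions) for p in candidates)
    return best_fixed - earned
```

For instance, with bids `[0.9, 0.4]` and a reserve of `0.5`, the winner pays the reserve because the second-highest bid falls below it. A regret bound of order √T means this gap grows sublinearly in the number of auctions T.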
