Paper Information - The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems

Online allocation problems with resource constraints are central problems in revenue management and online advertising. In these problems, requests arrive sequentially during a finite horizon and, for each request, a decision maker needs to choose an action that consumes a certain amount of resources and generates a reward. The objective is to maximize cumulative rewards subject to a constraint on the total consumption of resources. In this paper, we consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model that is unknown to the decision maker. We design a general class of algorithms that attain good performance under various input models without knowing which type of input they are facing. In particular, our algorithms are asymptotically optimal under the stochastic i.i.d. input model as well as various non-stationary stochastic input models, and they attain an asymptotically optimal fixed competitive ratio when the input is adversarial. Our algorithms operate in the Lagrangian dual space: they maintain a dual multiplier for each resource that is updated using online mirror descent. By choosing the reference function accordingly, we recover the dual sub-gradient descent and dual exponential weights algorithms. The resulting algorithms are simple, fast, and have minimal requirements on the reward functions, consumption functions, and the action space, in contrast to existing methods for online allocation problems. We discuss applications to network revenue management, online bidding in repeated auctions with budget constraints, online proportional matching with high entropy, and personalized assortment optimization with limited inventories.
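The dual-multiplier idea described in the abstract can be illustrated with a short Python sketch. It is not the authors' reference implementation. It assumes a finite action set per request in which action 0 is a zero-reward, zero-consumption "do nothing" option, a constant step size, and an arbitrary starting multiplier of 1.0 for the multiplicative update. The function name `dual_mirror_descent_sketch` and all parameter names are hypothetical. The `"subgradient"` rule corresponds to a squared Euclidean reference function (projected sub-gradient descent), and the `"exponential"` rule to a negative-entropy reference function (a multiplicative, exponential-weights-style update).

```python
import math


def dual_mirror_descent_sketch(requests, budgets, step_size, update="subgradient"):
    """Illustrative sketch only; names and defaults are assumptions.

    requests: list of (rewards, costs) pairs. rewards[a] is the reward of
              action a, and costs[a][j] is its consumption of resource j.
              Action 0 is assumed to be a null action (zero reward, zero cost).
    budgets:  total budget of each resource over the whole horizon.
    """
    if update not in ("subgradient", "exponential"):
        raise ValueError("update must be 'subgradient' or 'exponential'")

    horizon = len(requests)
    rho = [b / horizon for b in budgets]  # target consumption per period
    remaining = list(budgets)
    # The multiplicative update needs a strictly positive start (assumed 1.0).
    start = 1.0 if update == "exponential" else 0.0
    mu = [start] * len(budgets)
    total_reward = 0.0

    for rewards, costs in requests:
        # Primal step: maximise reward minus dual-priced consumption.
        def priced(a):
            return rewards[a] - sum(m * c for m, c in zip(mu, costs[a]))

        tentative = max(range(len(rewards)), key=priced)

        # Take the null action if the tentative action would overspend.
        feasible = all(c <= r for c, r in zip(costs[tentative], remaining))
        action = tentative if feasible else 0
        total_reward += rewards[action]
        remaining = [r - c for r, c in zip(remaining, costs[action])]

        # Dual step: target spend minus the tentative action's spend.
        grad = [p - c for p, c in zip(rho, costs[tentative])]
        if update == "subgradient":
            # Euclidean reference function: projected gradient step.
            mu = [max(0.0, m - step_size * g) for m, g in zip(mu, grad)]
        else:
            # Negative-entropy reference function: multiplicative step.
            mu = [m * math.exp(-step_size * g) for m, g in zip(mu, grad)]

    return total_reward, remaining, mu


# Usage: four identical unit-reward, unit-cost requests and a budget of 2.
unit_requests = [([0.0, 1.0], [[0.0], [1.0]])] * 4
reward, left, multipliers = dual_mirror_descent_sketch(unit_requests, [2.0], 0.5)
```

In this sketch the multiplier rises whenever a request's preferred action consumes more than the per-period target `rho`, which makes later consumption look more expensive, and falls otherwise. The two update rules differ only in how the same gradient signal is applied to the multipliers.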

Vahab Mirrokni | Haihao Lu | Santiago Balseiro
