Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms

We consider a single-product revenue management problem where, given an initial inventory, the objective is to dynamically adjust prices over a finite sales horizon to maximize expected revenues. Realized demand is observed over time, but the underlying functional relationship between price and mean demand rate that governs these observations (otherwise known as the demand function or demand curve) is not known. We consider two instances of this problem: (i) a setting where the demand function is assumed to belong to a known parametric family with unknown parameter values; and (ii) a setting where the demand function is assumed to belong to a broad class of functions that need not admit any parametric representation. In each case we develop policies that learn the demand function “on the fly” and set prices based on what has been learned. The performance of these algorithms is measured in terms of the regret: the revenue loss relative to the maximal revenues that can be extracted when the demand function is known prior to the start of the selling season. We derive lower bounds on the regret that hold for any admissible pricing policy, and then show that our proposed algorithms achieve a regret that is “close” to this lower bound. The magnitude of the regret can be interpreted as the economic value of prior knowledge of the demand function, manifested as the revenue loss due to model uncertainty.
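
To make the notion of regret concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a hypothetical linear demand curve with made-up parameters `a` and `b`, ignores demand noise entirely (sales equal the mean demand rate times elapsed time), and uses a simple explore-then-commit rule over a fixed price grid. All function names are illustrative. The benchmark is the best single price when the demand curve is known, and the regret is reported as a fraction of that benchmark revenue.

```python
def demand_rate(price, a=10.0, b=1.0):
    """Hypothetical linear demand curve (an assumption): mean demand per unit time."""
    return max(a - b * price, 0.0)


def fixed_price_revenue(price, inventory, horizon):
    """Noise-free revenue from holding one price; sales are capped by inventory."""
    units_sold = min(demand_rate(price) * horizon, inventory)
    return price * units_sold


def full_information_revenue(inventory, horizon, a=10.0, b=1.0):
    """Benchmark: best single price when the (assumed linear) curve is known."""
    unconstrained = a / (2.0 * b)             # maximizes p * (a - b * p)
    clearing = (a - inventory / horizon) / b  # sells exactly the inventory
    best_price = max(unconstrained, clearing)
    return fixed_price_revenue(best_price, inventory, horizon)


def explore_then_commit_revenue(prices, explore_time, inventory, horizon):
    """Try each grid price for an equal slice of time, then commit to one price."""
    slice_len = explore_time / len(prices)
    remaining = inventory
    revenue = 0.0
    estimated_rate = {}
    for p in prices:
        sold = min(demand_rate(p) * slice_len, remaining)
        # Estimated rate from observed sales; biased low if inventory ran out.
        estimated_rate[p] = sold / slice_len
        revenue += p * sold
        remaining -= sold
    time_left = horizon - explore_time

    def projected(p):
        # Projected revenue over the remaining time, capped by remaining stock.
        return p * min(estimated_rate[p] * time_left, remaining)

    best = max(prices, key=projected)
    revenue += best * min(demand_rate(best) * time_left, remaining)
    return revenue


def relative_regret(policy_revenue, benchmark_revenue):
    """Revenue loss as a fraction of the full-information benchmark."""
    return 1.0 - policy_revenue / benchmark_revenue


benchmark = full_information_revenue(inventory=20.0, horizon=10.0)
earned = explore_then_commit_revenue([4.0, 6.0, 8.0], explore_time=3.0,
                                     inventory=20.0, horizon=10.0)
print(benchmark, earned, relative_regret(earned, benchmark))
```

With these assumed numbers the benchmark revenue is 160, the policy earns 128, and the relative regret is about 0.2. The loss comes from the inventory sold at low prices during exploration, which is one way to read the abstract's "economic value of prior knowledge."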
