Learning to Price with Reference Effects

As a firm varies the price of a product, consumers exhibit reference effects: they base purchase decisions not only on the prevailing price but also on the product's price history. We consider the problem of learning such behavioral patterns as a monopolist releases, markets, and prices products. This setting calls for pricing decisions that trade off maximizing the revenue generated by the current product against probing to gain information of future benefit. Because demand depends on price history, realized demand can reflect delayed consequences of earlier pricing decisions. Inference therefore entails attributing outcomes to prior decisions, and effective exploration requires planning price sequences that yield informative future outcomes. Despite the considerable complexity of this problem, we offer a tractable, systematic approach. In particular, we frame the problem as one of reinforcement learning and leverage Thompson sampling. We also establish a regret bound that characterizes how performance improves as data is gathered and how this improvement depends on the complexity of the demand model. We illustrate the merits of the approach through simulations.
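To make the idea of Thompson sampling under reference effects concrete, here is a minimal sketch. It is not the paper's algorithm. Everything in it is an assumption chosen for illustration: the linear demand form, the exponentially smoothed reference price, the finite set of candidate parameter vectors, and the names `expected_demand`, `update_reference`, and `thompson_pricing`. It also chooses each price myopically (one step ahead) for the sampled model, whereas the approach described above plans price sequences.

```python
import math
import random

# Hypothetical demand model (an assumption, not taken from the paper):
#   demand = a - b * price + c * (reference - price) + Gaussian noise
# The reference price is an exponentially smoothed history of past prices.
# In this toy model demand is not clipped, so it can be negative.


def expected_demand(theta, price, reference):
    """Mean demand under parameters theta = (a, b, c)."""
    a, b, c = theta
    return a - b * price + c * (reference - price)


def update_reference(reference, price, memory=0.7):
    """Blend the old reference with the latest price (memory is assumed)."""
    return memory * reference + (1.0 - memory) * price


def thompson_pricing(candidates, true_theta, prices, horizon,
                     noise_sd=1.0, seed=0):
    """Run myopic Thompson sampling over a finite set of candidate models.

    Returns the final posterior over `candidates` and the total revenue.
    """
    rng = random.Random(seed)
    log_w = [0.0] * len(candidates)  # uniform prior, stored as log-weights
    reference = prices[0]            # assumed initial reference price
    revenue = 0.0
    for _ in range(horizon):
        # 1. Sample one candidate model from the current posterior.
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        theta = rng.choices(candidates, weights=weights, k=1)[0]
        # 2. Choose the price that maximizes one-step revenue for that sample.
        price = max(prices,
                    key=lambda p: p * expected_demand(theta, p, reference))
        # 3. Observe noisy demand generated by the true model.
        demand = (expected_demand(true_theta, price, reference)
                  + rng.gauss(0.0, noise_sd))
        revenue += price * demand
        # 4. Bayes update with the Gaussian log-likelihood of the observation.
        for i, cand in enumerate(candidates):
            err = demand - expected_demand(cand, price, reference)
            log_w[i] -= 0.5 * (err / noise_sd) ** 2
        # 5. The chosen price shifts the reference seen in later periods.
        reference = update_reference(reference, price)
    top = max(log_w)
    norm = sum(math.exp(lw - top) for lw in log_w)
    posterior = [math.exp(lw - top) / norm for lw in log_w]
    return posterior, revenue


if __name__ == "__main__":
    candidates = [(10.0, 1.0, 0.5), (12.0, 1.0, 2.0), (8.0, 0.5, 1.0)]
    prices = [float(p) for p in range(1, 9)]
    posterior, revenue = thompson_pricing(candidates, candidates[1],
                                          prices, horizon=200)
    print(posterior, revenue)
```

Step 5 is where the delayed consequences show up: the price chosen now changes the reference price, and hence demand, in later periods. A myopic rule like the one in step 2 ignores that effect, which is one reason a planning-based variant may be preferable in practice.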
