Dynamic pricing policies for interdependent perishable products or services using reinforcement learning

Dynamic prices maximize the expected revenue of interdependent products. Reinforcement learning optimizes the pricing of interdependent products. Interdependent pricing enhances learning.

Many businesses offer multiple interdependent products or services, where the demand for one is often affected by the prices of the others. This article considers a revenue management problem for multiple interdependent products, in which prices are dynamically adjusted over a finite sales horizon to maximize expected revenue, given an initial inventory of each product. The main contribution of this article is to use reinforcement learning to model the optimal pricing of perishable interdependent products when demand is stochastic and its functional form is unknown. We show that reinforcement learning can be used to price interdependent products. Moreover, we analyze the performance of the Q-learning with eligibility traces algorithm under different conditions. We illustrate our analysis with the pricing of services.
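
As a rough illustration of the kind of method the abstract names, the following is a minimal sketch of tabular Q-learning with eligibility traces (Watkins's Q(λ) variant) applied to a toy two-product pricing problem. It is not the article's model: the price grid, horizon, initial inventories, linear purchase probabilities, and all function names and parameter values are assumptions made for this example only. The learner observes states (period, stock of each product) and revenues, but never the demand function itself.

```python
import math
import random
from collections import defaultdict

# All constants below are hypothetical and chosen only to keep the example small.
PRICES = (1.0, 2.0)                                  # assumed price grid per product
ACTIONS = [(p1, p2) for p1 in PRICES for p2 in PRICES]
HORIZON = 5                                          # finite sales horizon (periods)
INIT_STOCK = (2, 2)                                  # initial inventory per product


def step(state, action, rng):
    """Simulate one period of an assumed demand model hidden from the learner."""
    t, s1, s2 = state
    p1, p2 = action
    # Assumed substitutes: a higher price on the other product raises own demand.
    q1 = min(1.0, max(0.0, 0.8 - 0.3 * p1 + 0.15 * p2))
    q2 = min(1.0, max(0.0, 0.8 - 0.3 * p2 + 0.15 * p1))
    sold1 = int(s1 > 0 and rng.random() < q1)
    sold2 = int(s2 > 0 and rng.random() < q2)
    revenue = p1 * sold1 + p2 * sold2
    return (t + 1, s1 - sold1, s2 - sold2), revenue


def best_value(Q, state):
    """Largest action value in a state; zero once the horizon has ended."""
    if state[0] >= HORIZON:
        return 0.0
    return max(Q[(state, a)] for a in ACTIONS)


def greedy_price(Q, state):
    """Price pair with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def train(episodes=2000, alpha=0.1, gamma=1.0, lam=0.7, epsilon=0.1, seed=0):
    """Watkins's Q(lambda) with accumulating traces and epsilon-greedy exploration."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        traces = {}
        state = (0,) + INIT_STOCK
        while state[0] < HORIZON:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_price(Q, state)
            # Watkins's variant: an exploratory (non-greedy) action cuts all traces.
            if Q[(state, action)] < best_value(Q, state):
                traces.clear()
            traces[(state, action)] = traces.get((state, action), 0.0) + 1.0
            next_state, revenue = step(state, action, rng)
            delta = revenue + gamma * best_value(Q, next_state) - Q[(state, action)]
            # Credit the temporal-difference error to recently visited pairs.
            for key in traces:
                Q[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam
            state = next_state
    return Q


if __name__ == "__main__":
    Q = train()
    start = (0,) + INIT_STOCK
    print("greedy opening prices:", greedy_price(Q, start))
```

Because the period index is part of the state, each state-action pair is visited at most once per episode, so accumulating and replacing traces coincide here. A realistic application would need a much larger state and price space, and the tabular representation used in this sketch would not scale to it.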
