Optimal Dynamic Assortment Planning with Demand Learning

We study a family of stylized assortment planning problems, where arriving customers make purchase decisions among offered products based on maximizing their utility. Given limited display capacity and no a priori information on consumers' utility, the retailer must select which subset of products to offer. By offering different assortments and observing the resulting purchase behavior, the retailer learns about consumer preferences, but this experimentation should be balanced with the goal of maximizing revenues. We develop a family of dynamic policies that judiciously balance the aforementioned trade-off between exploration and exploitation, and prove that their performance cannot be improved upon in a precise mathematical sense. One salient feature of these policies is that they “quickly” recognize, and hence limit experimentation on, strictly suboptimal products.
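The exploration-exploitation trade-off described above can be illustrated with a deliberately naive explore-then-commit sketch. This is not the policy developed in the paper. It assumes a multinomial logit (MNL) choice model with the no-purchase weight normalized to 1, known prices, and a brute-force search over assortments. All function names and parameters (`mnl_choice`, `explore_then_commit`, `explore_rounds`, and so on) are hypothetical and chosen only for illustration.

```python
import itertools
import random


def mnl_choice(assortment, weights, rng):
    """Sample one customer's decision under an assumed MNL model.

    Returns the index of the purchased product, or None for no purchase.
    The no-purchase option has weight 1.0 by assumption.
    """
    total = 1.0 + sum(weights[i] for i in assortment)
    u = rng.random() * total
    acc = 1.0
    if u < acc:
        return None
    for i in assortment:
        acc += weights[i]
        if u < acc:
            return i
    return assortment[-1]  # guard against floating-point round-off


def expected_revenue(assortment, weights, prices):
    """Expected revenue per customer for a given assortment under MNL."""
    total = 1.0 + sum(weights[i] for i in assortment)
    return sum(prices[i] * weights[i] for i in assortment) / total


def best_assortment(weights, prices, capacity):
    """Brute-force the revenue-maximizing assortment of size <= capacity."""
    n = len(weights)
    candidates = (
        c
        for k in range(1, capacity + 1)
        for c in itertools.combinations(range(n), k)
    )
    return max(candidates, key=lambda s: expected_revenue(s, weights, prices))


def explore_then_commit(true_weights, prices, capacity, explore_rounds,
                        horizon, seed=0):
    """Naive policy: cycle through test assortments, then commit.

    Exploration displays products in consecutive groups of `capacity`.
    Each weight is estimated as (purchases of i) / (no-purchases while i
    was displayed), which is the MNL odds ratio against the outside option.
    """
    rng = random.Random(seed)
    n = len(true_weights)
    groups = [tuple(range(s, min(s + capacity, n)))
              for s in range(0, n, capacity)]
    buys = [0] * n
    no_buys = [0] * n
    revenue = 0.0
    t = 0

    # Exploration phase: learn about preferences at the cost of revenue.
    schedule = [g for _ in range(explore_rounds) for g in groups][:horizon]
    for group in schedule:
        choice = mnl_choice(group, true_weights, rng)
        if choice is None:
            for i in group:
                no_buys[i] += 1
        else:
            buys[choice] += 1
            revenue += prices[choice]
        t += 1

    # Exploitation phase: offer the assortment that looks best so far.
    estimates = [buys[i] / max(no_buys[i], 1) for i in range(n)]
    committed = best_assortment(estimates, prices, capacity)
    while t < horizon:
        choice = mnl_choice(committed, true_weights, rng)
        if choice is not None:
            revenue += prices[choice]
        t += 1
    return committed, revenue
```

A call such as `explore_then_commit([1.0, 1.0, 0.1], [10.0, 8.0, 1.0], capacity=2, explore_rounds=200, horizon=2000)` returns the committed assortment and the total simulated revenue. Unlike the policies described in the abstract, this sketch spends the same exploration effort on every product and does not stop experimenting early on products that are clearly suboptimal.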
