Revenue Maximization and Learning in Products Ranking

We consider the revenue maximization problem for an online retailer who plans to display a set of products that differ in price and quality and to rank them in order. Consumers have random attention spans and view the products sequentially, either purchasing a ``satisficing'' product or leaving the platform empty-handed once the attention span is exhausted. Our framework extends the cascade model in two directions: consumers have random rather than fixed attention spans, and the firm maximizes revenue rather than click probabilities. When the attention span is fixed, we show that the optimal product ranking has a nested structure as a function of the attention span; building on this structure, we design a $1/e$-approximation algorithm for random attention spans. When the conditional purchase probabilities are not known and may depend on consumer and product features, we devise an online learning algorithm that achieves $\tilde{\mathcal{O}}(\sqrt{T})$ regret relative to the approximation algorithm, despite the censoring of information: the attention span of a customer who purchases an item is not observable. Numerical experiments demonstrate the outstanding performance of the approximation and online learning algorithms.
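To make the model concrete, the following is a minimal Python sketch of the expected revenue of a given ranking under a cascade model with a random attention span. It is not the paper's algorithm. It assumes (these are illustrative assumptions, not taken from the abstract) that the attention span is independent of purchase decisions, that each viewed product is bought with a fixed conditional probability, and that the consumer buys at most one product. The names `expected_revenue`, `best_ranking_brute_force`, `prices`, `buy_probs`, and `span_pmf` are hypothetical.

```python
from itertools import permutations


def expected_revenue(ranking, prices, buy_probs, span_pmf):
    """Expected revenue of a ranking (illustrative assumptions only).

    ranking   : product indices in display order
    prices    : prices[i] is the price of product i
    buy_probs : buy_probs[i] is the assumed probability that a consumer
                who views product i finds it satisficing and buys it
    span_pmf  : dict mapping attention span (number of positions the
                consumer is willing to view) to its probability
    """
    revenue = 0.0
    no_purchase_so_far = 1.0  # probability that no earlier product was bought
    for position, item in enumerate(ranking, start=1):
        # Probability that the attention span covers this position.
        span_covers = sum(p for span, p in span_pmf.items() if span >= position)
        revenue += span_covers * no_purchase_so_far * buy_probs[item] * prices[item]
        no_purchase_so_far *= 1.0 - buy_probs[item]
    return revenue


def best_ranking_brute_force(prices, buy_probs, span_pmf):
    """Enumerate all rankings; only feasible for a handful of products."""
    items = range(len(prices))
    return max(
        permutations(items),
        key=lambda r: expected_revenue(r, prices, buy_probs, span_pmf),
    )


if __name__ == "__main__":
    prices = [10.0, 6.0, 3.0]
    buy_probs = [0.2, 0.5, 0.9]
    span_pmf = {1: 0.5, 2: 0.3, 3: 0.2}
    ranking = best_ranking_brute_force(prices, buy_probs, span_pmf)
    print(ranking, expected_revenue(ranking, prices, buy_probs, span_pmf))
```

The brute-force search is included only to show the objective being optimized; its cost grows factorially with the number of products, which is why an approximation algorithm such as the $1/e$-approximation described above is of interest.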

[1]  Victor F. Araman,et al.  Dynamic Pricing for Nonperishable Products with Demand Learning , 2009, Oper. Res..

[2]  Olivier Cappé,et al.  Multiple-Play Bandits in the Position-Based Model , 2016, NIPS.

[3]  Min-hwan Oh,et al.  Thompson Sampling for Multinomial Logit Contextual Bandits , 2019, NeurIPS.

[4]  Eric T. Bradlow,et al.  Does In-Store Marketing Work? Effects of the Number and Position of Shelf Facings on Brand Attention and Evaluation at the Point of Purchase , 2009 .

[5]  Juan José Miranda Bront,et al.  A Column Generation Algorithm for Choice-Based Network Revenue Management , 2008, Oper. Res..

[6]  Csaba Szepesvári,et al.  Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.

[7]  Nick Craswell,et al.  An experimental comparison of click position-bias models , 2008, WSDM '08.

[8]  Guillermo Gallego,et al.  Attention, Consideration then Selection Choice Model , 2017 .

[9]  Anindya Ghose,et al.  An Empirical Analysis of Search Engine Advertising: Sponsored Search in Electronic Markets , 2009, Manag. Sci..

[10]  Huseyin Topaloglu,et al.  Assortment Optimization and Pricing Under the Multinomial Logit Model with Impatient Customers: Sequential Recommendation and Selection , 2021, Oper. Res..

[11]  Zheng Wen,et al.  DCM Bandits: Learning to Rank with Multiple Clicks , 2016, ICML.

[12]  David B. Shmoys,et al.  Dynamic Assortment Optimization with a Multinomial Logit Choice Model and Capacity Constraint , 2010, Oper. Res..

[13]  Sébastien Bubeck,et al.  Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..

[14]  Wang Chi Cheung,et al.  Dynamic Pricing and Demand Learning with Limited Price Experimentation , 2017 .

[15]  John Morgan,et al.  Clicks, Discontinuities, and Firm Demand Online , 2006 .

[16]  Danny Segev,et al.  Improved Approximation Schemes for MNL-Driven Sequential Assortment Optimization , 2019, SSRN Electronic Journal.

[17]  Assaf J. Zeevi,et al.  Optimal Dynamic Assortment Planning with Demand Learning , 2013, Manuf. Serv. Oper. Manag..

[18]  Danny Segev,et al.  Display Optimization for Vertically Differentiated Locations Under Multinomial Logit Preferences , 2015, Manag. Sci..

[19]  Ozge Sahin,et al.  The Impact of Consumer Search Cost on Assortment Planning and Pricing , 2014 .

[20]  Michael D. Smith,et al.  Location, Location, Location: An Analysis of Profitability of Position in Online Advertising Markets , 2008 .

[21]  Nan Liu,et al.  Assortment Optimization under the Multinomial Logit Model with Sequential Offerings , 2018 .

[22]  Madeleine Udell,et al.  Dynamic Assortment Personalization in High Dimensions , 2016, Oper. Res..

[23]  Huseyin Topaloglu,et al.  Assortment Optimization Under the Multinomial Logit Model with Sequential Offerings , 2020, INFORMS J. Comput..

[24]  N. B. Keskin,et al.  Personalized Dynamic Pricing with Machine Learning: High Dimensional Features and Heterogeneous Elasticity , 2020 .

[25]  Ali Shameli,et al.  Ranking an Assortment of Products via Sequential Submodular Optimization , 2020, ArXiv.

[26]  Zheng Wen,et al.  Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback , 2016, NIPS.

[27]  Vahab Mirrokni,et al.  Product Ranking on Online Platforms , 2018, EC.

[28]  Pascal Van Hentenryck,et al.  Assortment optimization under a multinomial logit model with position bias and social influence , 2014, 4OR.

[29]  Kris Ferreira,et al.  Learning to Rank an Assortment of Products , 2020, Manag. Sci..

[30]  Josef Broder,et al.  Dynamic Pricing Under a General Parametric Choice Model , 2012, Oper. Res..

[31]  Zheng Wen,et al.  Cascading Bandits: Learning to Rank in the Cascade Model , 2015, ICML.

[32]  Mohammad Mahdian,et al.  A Cascade Model for Externalities in Sponsored Search , 2008, WINE.

[33]  Guillermo Gallego,et al.  A Primal-dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint , 2018, Math. Oper. Res..

[34]  Xi Chen,et al.  Dynamic Assortment Optimization with Changing Contextual Information , 2018, J. Mach. Learn. Res..

[35]  Xi Chen,et al.  Context‐based dynamic pricing with online clustering , 2019, Production and Operations Management.

[36]  Omar Besbes,et al.  Blind Network Revenue Management , 2011, Oper. Res..

[37]  Izak Duenyas,et al.  Nonparametric Self-Adjusting Control for Joint Learning and Optimization of Multiproduct Pricing with Finite Resource Capacity , 2019, Math. Oper. Res..

[38]  D. Simchi-Levi,et al.  Online Network Revenue Management Using Thompson Sampling , 2017 .

[39]  A. V. den Boer,et al.  Dynamic Pricing and Learning: Historical Origins, Current Research, and New Directions , 2013 .

[40]  Wei Sun,et al.  Sequential Choice Bandits: Learning with Marketing Fatigue , 2019, SSRN Electronic Journal.

[41]  Zheng Wen,et al.  Combinatorial Cascading Bandits , 2015, NIPS.

[42]  Garrett J. van Ryzin,et al.  Revenue Management Under a General Discrete Choice Model of Consumer Behavior , 2004, Manag. Sci..

[43]  Guillermo Gallego,et al.  Nonparametric Pricing Analytics with Customer Covariates , 2018 .

[44]  Vineet Goyal,et al.  Near-Optimal Algorithms for Capacity Constrained Assortment Optimization , 2014 .

[45]  Omar Besbes,et al.  Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms , 2009, Oper. Res..

[46]  Zheng Wen,et al.  Cascading Bandits for Large-Scale Recommendation Problems , 2016, UAI.

[47]  Vincent Y. F. Tan,et al.  A Thompson Sampling Algorithm for Cascading Bandits , 2019, AISTATS.

[48]  A. Zeevi,et al.  Non-Stationary Stochastic Optimization , 2014 .

[49]  Hemant K. Bhargava,et al.  Implementing Sponsored Search in Web Search Engines: Computational Evaluation of Alternative Mechanisms , 2007, INFORMS J. Comput..

[50]  Garrett J. van Ryzin,et al.  Stocking Retail Assortments Under Dynamic Consumer Substitution , 2001, Oper. Res..

[51]  Huseyin Topaloglu,et al.  Assortment Optimization Under Variants of the Nested Logit Model , 2014, Oper. Res..

[52]  Vashist Avadhanula,et al.  MNL-Bandit: A Dynamic Learning Approach to Assortment Selection , 2017, Oper. Res..

[53]  Wei Sun,et al.  Dynamic Learning of Sequential Choice Bandit Problem under Marketing Fatigue , 2019, AAAI.

[54]  Pascal Van Hentenryck,et al.  Assortment Optimization under the Sequential Multinomial Logit Model , 2017, Eur. J. Oper. Res..

[55]  Van-Anh Truong,et al.  Approximation Algorithms for Product Framing and Pricing , 2020, Oper. Res..

[56]  Danny Segev,et al.  Click-Based MNL: Algorithmic Frameworks for Modeling Click Data in Assortment Optimization , 2019, SSRN Electronic Journal.