Learning to Rank an Assortment of Products

We consider the product ranking challenge that online retailers face when their customers typically behave as "window shoppers": they form an impression of the assortment after browsing the products ranked in the initial positions and then decide whether to continue browsing. We design online learning algorithms for product ranking that maximize the number of customers who engage with the site. Customers' product preferences and attention spans are correlated and unknown to the retailer; furthermore, the retailer cannot exploit similarities across products, owing to subjective, stylistic elements and the fact that products may not be substitutes. We develop a class of online learning-then-earning algorithms that prescribe a ranking to offer each customer, learning from preceding customers' clickstream data to offer better rankings to subsequent customers. Our algorithms balance product popularity with diversity, that is, appealing to a wide variety of heterogeneous customers. We prove that our learning algorithms converge to a ranking that matches the best-known approximation factors for the offline, complete-information setting. Finally, we partner with Wayfair, a multibillion-dollar online home goods retailer, to estimate the impact of our algorithms in practice via simulations using actual clickstream data, and we find that our algorithms yield a significant increase (5-30%) in the number of customers who engage with the site.
