Value-aware Recommendation based on Reinforced Profit Maximization in E-commerce Systems

Existing recommendation algorithms mostly optimize traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE, or the quality of top-$k$ recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommendation systems is to improve the final revenue or profit of the system, and traditional targets such as rating prediction and top-$k$ recommendation are not directly related to this goal. In this work, we blend fundamental concepts from online advertising and microeconomics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items when generating the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising to the conversion rate of an arbitrary user action (XVR) in E-commerce, where the user action can be clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. The economic values of the different user actions are then integrated into the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for maximum total value. Experimental results on both offline benchmarks and online commercial systems verify the improved performance of our framework, in terms of both traditional top-$k$ ranking tasks and the economic profit of the system.
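
The idea of mapping each user action to a monetized value and summing those values into the reward of a ranking list can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the action names, the per-action value weights in `ACTION_VALUE`, the price-proportional value model, and the function names are all hypothetical. The ranking step is a plain greedy sort by expected value, used here as a stand-in; it is not the reinforcement learning procedure the abstract describes.

```python
from typing import Dict, List, Tuple

# Hypothetical value weights: the fraction of an item's price that each
# user action is assumed to be worth. These numbers are illustrative only.
ACTION_VALUE: Dict[str, float] = {
    "click": 0.05,
    "add_to_cart": 0.5,
    "purchase": 1.0,
}

# A candidate is (XVR estimates per action, item price).
Candidate = Tuple[Dict[str, float], float]


def expected_item_value(xvr: Dict[str, float], price: float) -> float:
    """Expected monetized value of showing one item.

    xvr maps an action name to its estimated conversion rate for this
    user-item pair; each rate is weighted by the assumed action value.
    """
    return price * sum(ACTION_VALUE[action] * rate for action, rate in xvr.items())


def list_reward(items: List[Candidate]) -> float:
    """Reward of a recommendation list: the sum of expected item values."""
    return sum(expected_item_value(xvr, price) for xvr, price in items)


def greedy_top_k(candidates: Dict[str, Candidate], k: int) -> List[str]:
    """Greedy stand-in for the learned policy: rank by expected value."""
    ranked = sorted(
        candidates,
        key=lambda item_id: expected_item_value(*candidates[item_id]),
        reverse=True,
    )
    return ranked[:k]


candidates: Dict[str, Candidate] = {
    "A": ({"click": 0.2, "purchase": 0.01}, 100.0),
    "B": ({"click": 0.1, "purchase": 0.05}, 20.0),
}
top = greedy_top_k(candidates, k=1)
total = list_reward(list(candidates.values()))
```

In this toy setup the cheaper item has the higher purchase rate, yet the more expensive item ranks first because its expected monetized value is larger. That is the kind of trade-off a value-aware objective captures and a purely accuracy-based ranking would miss.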
