Value-aware Recommendation based on Reinforcement Profit Maximization

Existing recommendation algorithms mostly optimize traditional recommendation measures, such as the accuracy of rating prediction in terms of RMSE or the quality of top-k recommendation lists in terms of precision, recall, MAP, etc. However, an important expectation for commercial recommender systems is to improve the final revenue/profit of the system, and traditional targets such as rating prediction and top-k recommendation are not directly related to this goal. In this work, we blend fundamental concepts from online advertising and micro-economics into personalized recommendation for profit maximization. Specifically, we propose value-aware recommendation based on reinforcement learning, which directly optimizes the economic value of candidate items to generate the recommendation list. In particular, we generalize the basic concept of click conversion rate (CVR) in computational advertising to the conversion rate of an arbitrary user action (XVR) in E-commerce, where user actions can include clicking, adding to cart, adding to wishlist, etc. In this way, each type of user action is mapped to its monetized economic value. The economic values of different user actions are then integrated as the reward of a ranking list, and reinforcement learning is used to optimize the recommendation list for the maximum total value. Experimental results on both offline benchmarks and online commercial systems verify the improved performance of our framework, in terms of both traditional top-k ranking tasks and the economic profit of the system.
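The core idea of the reward construction can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: the action types, monetized values, and function names below are assumed for demonstration. The sketch computes the expected economic value of each item as the sum over action types of (monetized value of the action) × (predicted XVR of the action), and the reward of a ranking list as the total expected value of its items.

```python
# Assumed monetized value of each action type (illustrative numbers).
ACTION_VALUES = {
    "click": 0.05,
    "add_to_cart": 0.50,
    "purchase": 12.00,
}

def item_value(xvr: dict) -> float:
    """Expected economic value of one item:
    sum over actions a of value(a) * XVR(a)."""
    return sum(ACTION_VALUES[action] * prob for action, prob in xvr.items())

def list_reward(ranked_items: list) -> float:
    """Reward of a recommendation list: total expected value of its items.
    This scalar is what a reinforcement learning agent would maximize."""
    return sum(item_value(xvr) for xvr in ranked_items)

# Example: two ranked items with predicted per-action conversion rates.
ranked = [
    {"click": 0.30, "add_to_cart": 0.10, "purchase": 0.02},
    {"click": 0.20, "purchase": 0.01},
]
print(list_reward(ranked))  # total expected monetized value of the list
```

In a full system the XVR estimates would come from learned models per action type, and the list-level reward would feed a reinforcement learning policy that reorders candidates to maximize it.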
