Deep Reinforcement Learning for Personalized Search Story Recommendation

In recent years, the search story, a combined display with other organic channels, has become a major source of user traffic on platforms such as e-commerce search platforms, news feed platforms, and web and image search platforms. A recommended search story guides a user to identify her own preference and personal intent, which subsequently influences the user's real-time and long-term search behavior. As search stories become increasingly important, we study the problem of personalized search story recommendation within a search engine, which aims to suggest a search story relevant to both a search keyword and an individual user's interest. To address the challenge of modeling both the immediate and future values of recommended search stories (i.e., the cross-channel effect), for which a conventional supervised learning framework is not applicable, we resort to a Markov decision process and propose a deep reinforcement learning architecture trained by both imitation learning and reinforcement learning. We empirically demonstrate the effectiveness of our proposed approach through extensive experiments on real-world data sets from this http URL.

Dongwon Lee | Jason Zhang | Linhong Zhu | Junming Yin
