Paper Information - Reinforcement Learning for User Intent Prediction in Customer Service Bots

A customer service bot is now a necessary component of an e-commerce platform. As a core module of the bot, user intent prediction anticipates the questions users are likely to ask before they ask them. A typical solution retrieves the top candidate questions that a user is likely to be interested in. Such a solution ignores the inter-relationships between questions and often aims to maximize an immediate reward such as clicks, which may not be ideal in practice. Hence, we propose to view the problem as a sequential decision-making process to better capture the long-term effect of each recommendation in the list. Concretely, we formulate the problem as a Markov decision process and apply reinforcement learning to it. With this approach, the questions presented to users are both relevant and diverse. Experiments on an offline real-world dataset and an online system demonstrate the effectiveness of the proposed approach.
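To make the sequential view concrete, the following minimal Python sketch contrasts a plain top-k ranking with a list built one question at a time, where the state is the list shown so far and the action is the next question to append. Everything in it is an illustrative assumption rather than the paper's method: the question names, the relevance scores, the topic labels, the redundancy penalty, and the hand-written greedy policy, which merely stands in for a policy learned with reinforcement learning.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data (not from the paper): a predicted click relevance
# per candidate question and a topic label used as a crude similarity signal.
RELEVANCE: Dict[str, float] = {
    "where_is_my_order": 0.9,
    "track_my_package": 0.85,
    "how_to_refund": 0.6,
    "change_address": 0.4,
}
TOPIC: Dict[str, str] = {
    "where_is_my_order": "logistics",
    "track_my_package": "logistics",
    "how_to_refund": "refund",
    "change_address": "account",
}


def step_reward(state: Tuple[str, ...], action: str, penalty: float = 0.5) -> float:
    """Assumed reward for appending `action` to the list held in `state`.

    The reward depends on the state: a question whose topic is already
    in the list is penalized, which is what couples the decisions.
    """
    redundant = any(TOPIC[q] == TOPIC[action] for q in state)
    return RELEVANCE[action] - (penalty if redundant else 0.0)


def episode_return(questions: List[str]) -> float:
    """Sum of step rewards when the list is built in the given order."""
    total = 0.0
    for i, q in enumerate(questions):
        total += step_reward(tuple(questions[:i]), q)
    return total


def top_k_by_relevance(k: int) -> List[str]:
    """Baseline: rank each question independently by immediate relevance."""
    return sorted(RELEVANCE, key=RELEVANCE.get, reverse=True)[:k]


def sequential_list(k: int) -> List[str]:
    """Build the list step by step with a greedy, state-aware policy.

    This greedy rule is a stand-in for a learned policy, not an RL algorithm.
    """
    state: Tuple[str, ...] = ()
    for _ in range(k):
        candidates = [q for q in RELEVANCE if q not in state]
        best = max(candidates, key=lambda q: step_reward(state, q))
        state = state + (best,)
    return list(state)


if __name__ == "__main__":
    for name, lst in [("top-k", top_k_by_relevance(3)),
                      ("sequential", sequential_list(3))]:
        print(name, lst, round(episode_return(lst), 2))
```

Under these assumed numbers the top-k baseline shows two logistics questions, while the sequential list covers three different topics and collects a higher total reward. A learned policy would replace the greedy rule and optimize the long-term return from logged user feedback instead of a hand-set penalty.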
