Estimation-Action-Reflection: Towards Deep Interaction Between Conversational and Recommender Systems

Recommender systems are embracing conversational technologies to obtain user preferences dynamically and to overcome the inherent limitations of their static models. A successful Conversational Recommender System (CRS) requires proper handling of the interactions between conversation and recommendation. We argue that three fundamental problems need to be solved: 1) what questions to ask regarding item attributes, 2) when to recommend items, and 3) how to adapt to users' online feedback. To the best of our knowledge, no unified framework addresses all three problems. In this work, we fill this gap by proposing a new CRS framework named Estimation-Action-Reflection, or EAR, which consists of three stages to better converse with users: (1) Estimation, which builds predictive models to estimate user preference on both items and item attributes; (2) Action, which learns a dialogue policy to determine whether to ask about attributes or recommend items, based on the Estimation stage and the conversation history; and (3) Reflection, which updates the recommender model when a user rejects the recommendations made by the Action stage. We present two conversation scenarios, on binary and enumerated questions, and conduct extensive experiments on two datasets, from Yelp and LastFM, one for each scenario. Our experiments demonstrate significant improvements over the state-of-the-art method CRM [32], corresponding to fewer conversation turns and a higher level of recommendation hits.
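The three stages can be illustrated with a toy control loop. This is only a minimal sketch under strong assumptions, not the method evaluated in the paper: the function name `run_ear_toy`, the attribute-set item representation, and the simulated user are all hypothetical; a hand-written rule stands in for the learned dialogue policy; and removing rejected items from the candidate set stands in for updating the recommender model.

```python
def run_ear_toy(items, target, max_turns=10, top_k=1):
    """Toy Estimation-Action-Reflection loop (illustrative assumptions only).

    items  : dict mapping item id -> set of attribute strings
    target : id of the item the simulated user is looking for
    Returns (turns_used, "hit" or "fail").
    """
    candidates = set(items)   # items still consistent with the dialogue
    asked = set()             # attributes already asked about

    for turn in range(1, max_turns + 1):
        # Estimation (toy): count how many candidates carry each unasked
        # attribute; an attribute is informative if it splits the candidates.
        counts = {}
        for item in candidates:
            for attr in items[item] - asked:
                counts[attr] = counts.get(attr, 0) + 1
        informative = {a: c for a, c in counts.items() if c < len(candidates)}

        # Action (toy rule, NOT a learned policy): recommend when the
        # candidate set is small or nothing useful is left to ask.
        if len(candidates) <= top_k or not informative:
            recommended = sorted(candidates)[:top_k]
            if target in recommended:
                return turn, "hit"
            # Reflection (toy): treat rejected items as negative feedback
            # and drop them before the next turn.
            candidates -= set(recommended)
            if not candidates:
                return turn, "fail"
        else:
            # Ask the attribute that splits the candidates most evenly.
            half = len(candidates) / 2
            attr = min(sorted(informative), key=lambda a: abs(informative[a] - half))
            asked.add(attr)
            # Simulated user answers a binary question truthfully.
            if attr in items[target]:
                candidates = {i for i in candidates if attr in items[i]}
            else:
                candidates = {i for i in candidates if attr not in items[i]}

    return max_turns, "fail"


toy_items = {
    "a": {"jazz", "live"},
    "b": {"jazz"},
    "c": {"rock", "live"},
    "d": {"rock"},
}
turns, outcome = run_ear_toy(toy_items, target="c")
print(turns, outcome)
```

The sketch mirrors the two measures named in the abstract: the returned turn count corresponds to conversation length, and the outcome corresponds to whether a recommendation hit occurred.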

[1]  Patrick Seemann, et al. Matrix Factorization Techniques for Recommender Systems, 2014.

[2]  Xiangnan He, et al. A Generic Coordinate Descent Framework for Learning from Implicit Feedback, 2016, WWW.

[3]  Yehuda Koren, et al. Matrix Factorization Techniques for Recommender Systems, 2009, Computer.

[4]  Konstantina Christakopoulou, et al. Q&R: A Two-Stage Approach toward Interactive Recommendation, 2018, KDD.

[5]  Yunming Ye, et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, 2017, IJCAI.

[6]  Shuai Li, et al. Online Clustering of Bandits, 2014, ICML.

[7]  Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.

[8]  Xu Chen, et al. Towards Conversational Search and Recommendation: System Ask, User Respond, 2018, CIKM.

[9]  Doina Precup, et al. Algorithms for multi-armed bandit problems, 2014, ArXiv.

[10]  Mohan S. Kankanhalli, et al. MMALFM, 2018, ACM Trans. Inf. Syst.

[11]  Bin Shen, et al. Collaborative Memory Network for Recommendation Systems, 2018, SIGIR.

[12]  Zhaochun Ren, et al. Hierarchical Variational Memory Network for Dialogue Generation, 2018, WWW.

[13]  Hang Li, et al. Toward Building Conversational Recommender Systems: A Contextual Bandit Approach, 2019, ArXiv.

[14]  Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.

[15]  Jianfeng Gao, et al. Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access, 2016, ACL.

[16]  Tat-Seng Chua, et al. Fast Matrix Factorization for Online Recommendation with Implicit Feedback, 2016, SIGIR.

[17]  Christopher Joseph Pal, et al. Towards Deep Conversational Recommendations, 2018, NeurIPS.

[18]  Fumin Shen, et al. Chat More: Deepening and Widening the Chatting Topic via A Deep Model, 2018, SIGIR.

[19]  Giuseppe Sansonetti, et al. An Approach to Conversational Recommendation of Restaurants, 2019, HCI.

[20]  Elena Karahanna, et al. Online Recommendation Systems in a B2C E-Commerce Context: A Review and Future Directions, 2015, J. Assoc. Inf. Syst.

[21]  Hongxia Jin, et al. A Visual Dialog Augmented Interactive Recommender System, 2019, KDD.

[22]  Steffen Rendle, et al. Factorization Machines, 2010, 2010 IEEE International Conference on Data Mining.

[23]  Filip Radlinski, et al. Towards Conversational Recommender Systems, 2016, KDD.

[24]  Yi Zhang, et al. Conversational Recommender System, 2018, SIGIR.

[25]  Min-Yen Kan, et al. Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures, 2018, ACL.

[26]  Quanquan Gu, et al. Contextual Bandits in a Collaborative Environment, 2016, SIGIR.

[27]  Tat-Seng Chua, et al. Deep Conversational Recommender in Travel, 2019, ArXiv.

[28]  Zhaochun Ren, et al. Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation, 2018, CIKM.

[29]  Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.

[30]  Xiangnan He, et al. Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention, 2017, SIGIR.

[31]  Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.

[32]  M S Ayundhita, et al. Ontology-based conversational recommender system for recommending laptop, 2019.

[33]  Mohan S. Kankanhalli, et al. Aspect-Aware Latent Factor Model: Rating Prediction with Ratings and Reviews, 2018, WWW.

[34]  Qingyun Wu, et al. Learning Contextual Bandits in a Non-stationary Environment, 2018, SIGIR.

[35]  Tat-Seng Chua, et al. Knowledge-aware Multimodal Dialogue Systems, 2018, ACM Multimedia.

[36]  Lars Schmidt-Thieme, et al. BPR: Bayesian Personalized Ranking from Implicit Feedback, 2009, UAI.

[37]  Tat-Seng Chua, et al. Neural Factorization Machines for Sparse Predictive Analytics, 2017, SIGIR.

[38]  Bilih Priyogi, et al. Preference Elicitation Strategy for Conversational Recommender System, 2019, WSDM.

[39]  Shuai Li, et al. Collaborative Filtering Bandits, 2015, SIGIR.

[40]  Tat-Seng Chua, et al. Neural Collaborative Filtering, 2017, WWW.

[41]  Huazheng Wang, et al. Factorization Bandits for Interactive Recommendation, 2017, AAAI.