Clustering of Conversational Bandits for User Preference Learning and Elicitation

Conversational recommender systems elicit user preferences through interactive conversations. By introducing conversational key-terms, existing conversational recommenders can effectively reduce the extensive exploration that a traditional interactive recommender requires. However, existing approaches that elicit user preferences via key-terms still have limitations. First, the key-term data of the items must be carefully labeled, which requires considerable human effort. Second, the number of human-labeled key-terms is limited and their granularity is fixed, whereas the user preference elicited during a conversation usually moves from coarse-grained to fine-grained. In this paper, we propose a clustering of conversational bandits algorithm. To avoid the human labeling effort and automatically learn key-terms with the proper granularity, we cluster the items online and generate meaningful key-terms for them during the conversational interactions. Our algorithm is general and can also be applied to user clustering when feedback from multiple users is available, which further leads to more accurate learning and generation of conversational key-terms. We analyze the regret bound of our learning algorithm. In the empirical evaluations, without using any human-labeled key-terms, our algorithm effectively generates meaningful coarse-to-fine-grained key-terms and performs as well as or better than the state-of-the-art baseline.
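To make the idea of key-terms at several granularities concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that items are represented as feature vectors and that each cluster of items can stand in for one key-term, represented by its centroid. It also swaps in plain batch k-means for the online clustering described above. The names `kmeans` and `coarse_to_fine_key_terms` and the parameter `ks` are hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain batch k-means on lists of floats (an illustrative stand-in,
    not the online clustering of the paper). Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct items.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each item joins its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def coarse_to_fine_key_terms(items, ks=(2, 4)):
    """Cluster the items once per granularity level in `ks` (assumed to be
    increasing). Each cluster plays the role of one key-term; its centroid
    is the key-term's feature vector (an assumption of this sketch)."""
    return [kmeans(items, k) for k in ks]


if __name__ == "__main__":
    # Four toy items forming two coarse groups of two items each.
    items = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]]
    for centroids, labels in coarse_to_fine_key_terms(items):
        print(len(centroids), "key-terms; item assignments:", labels)
```

In this toy setup, a small `k` yields a few broad key-terms and a larger `k` yields narrower ones, which mirrors the coarse-to-fine progression described in the abstract without relying on any human labels.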
