Group recommendations via multi-armed bandits
暂无分享,去创建一个
We study recommendations for persistent groups that repeatedly engage in a joint activity. We approach this as a multi-arm bandit problem. We design a recommendation policy and show it has logarithmic regret. Our analysis also shows that regret depends linearly on d, the size of the underlying persistent group. We evaluate our policy on movie recommendations over the MovieLens and MoviePilot datasets.
[1] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.