A modern Bayesian look at the multi-armed bandit
A multi-armed bandit is an experiment with the goal of accumulating rewards from a payoff distribution with unknown parameters that are to be learned sequentially. This article describes a heuristic for managing multi-armed bandits called randomized probability matching, which randomly allocates observations to arms according to the Bayesian posterior probability that each arm is optimal.
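
As a rough illustration of the idea (not code from the article itself), the following is a minimal sketch of randomized probability matching for a Bernoulli-reward bandit with conjugate Beta priors: each step draws one posterior sample per arm and plays the arm whose sample is largest, so arms are selected in proportion to their posterior probability of being optimal. The function names, the Beta(1, 1) priors, and the payoff rates are illustrative assumptions.

```python
import numpy as np

def probability_matching_step(successes, failures, rng):
    """Draw one posterior sample per arm; play the arm with the largest draw.

    Posterior for arm k is Beta(1 + successes[k], 1 + failures[k]),
    i.e. a uniform Beta(1, 1) prior is assumed for every arm.
    """
    samples = rng.beta(successes + 1, failures + 1)
    return int(np.argmax(samples))

# Usage: simulate a two-armed bandit with unknown (hypothetical) payoff rates.
rng = np.random.default_rng(0)
true_rates = np.array([0.05, 0.10])   # unknown to the algorithm
successes = np.zeros(2)
failures = np.zeros(2)
for _ in range(1000):
    arm = probability_matching_step(successes, failures, rng)
    reward = rng.random() < true_rates[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward
print(successes + failures)           # pull counts concentrate on the better arm
```

Because allocation is randomized rather than greedy, the rule keeps exploring arms whose posteriors still plausibly contain the best payoff rate, while sending most observations to the apparent winner as evidence accumulates.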