Near-Optimal Regret Bounds for Thompson Sampling
Thompson Sampling (TS) is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and it has recently generated significant interest after sev...
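
To make the idea of a randomized, Bayesian bandit algorithm concrete, the following is a minimal sketch of Thompson Sampling for the common Bernoulli-reward setting with uniform Beta(1, 1) priors. The function name `thompson_sampling`, its parameters, and the simulated arms are assumptions made for illustration; they are not taken from the text above.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling (illustrative sketch).

    true_means: hypothetical success probability of each arm (unknown to the learner).
    horizon:    number of rounds to play.
    Returns (pulls per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    total_reward = 0

    for _ in range(horizon):
        # Draw one sample per arm from its posterior Beta(successes + 1, failures + 1).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a simulated Bernoulli reward and update that arm's posterior counts.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward

    pulls = [successes[i] + failures[i] for i in range(n_arms)]
    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], horizon=2000)
    print("pulls per arm:", pulls, "total reward:", total)
```

The randomization comes entirely from the posterior draws: an arm with few observations has a wide posterior and is still chosen occasionally, while an arm that is clearly worse is sampled less and less often as evidence accumulates.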