Paper Information - Near-Optimal Regret Bounds for Thompson Sampling

Near-Optimal Regret Bounds for Thompson Sampling

Thompson Sampling (TS) is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas and has recently generated significant interest after sev...

Shipra Agrawal | Navin Goyal