Paper information - Bandits with Global Convex Constraints and Objective

Multiarmed bandit (MAB) is a classic model for capturing the exploration–exploitation trade-off inherent in many sequential decision-making problems. The classic MAB framework, however, only allows...

Nikhil R. Devanur | Shipra Agrawal