The Budgeted Multi-armed Bandit Problem
The coins problem described below is a version of a multi-armed bandit problem in which one must select from among a set of objects, say classifiers, after an experimentation phase constrained by a time or cost budget. The question is how to spend the budget. The problem involves pure exploration only, which differentiates it from typical multi-armed bandit problems that involve an exploration/exploitation tradeoff [BF85]. It is an abstraction of the following scenarios: choosing from among a set of alternative treatments after a fixed number of clinical trials; determining the best parameter settings for a program given a deadline that allows only a fixed number of runs; or choosing a life partner on a bachelor/bachelorette TV show where time is limited. We are interested in the computational complexity of the coins problem and in efficient algorithms with approximation guarantees.
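To make the setting concrete, here is a minimal sketch of the simplest budgeted pure-exploration strategy: spend the flip budget round-robin across the coins, then commit to the coin with the highest empirical heads frequency. This uniform-allocation baseline is an assumption for illustration, not the algorithm studied in the paper; the function name and biases are hypothetical.

```python
import random

def uniform_allocation(coin_biases, budget, rng=random):
    """Spend a flip budget uniformly across coins (pure exploration),
    then return the index of the coin with the best empirical mean."""
    n = len(coin_biases)
    heads = [0] * n
    flips = [0] * n
    for t in range(budget):
        i = t % n                          # round-robin allocation of the budget
        flips[i] += 1
        if rng.random() < coin_biases[i]:  # simulate one flip of coin i
            heads[i] += 1
    # Commit to the empirically best coin; unflipped coins count as 0.
    return max(range(n), key=lambda i: heads[i] / flips[i] if flips[i] else 0.0)

# Hypothetical example: three coins with hidden biases, budget of 300 flips.
rng = random.Random(0)
best = uniform_allocation([0.4, 0.6, 0.5], 300, rng)
```

Smarter budget allocations (e.g. adaptively dropping clearly inferior coins) can use the same budget to identify the best coin with higher probability, which is exactly the tradeoff the problem formalizes.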
[1] Russell Greiner et al. Budgeted Learning of Naive-Bayes Classifiers, 2003, UAI.
[2] P. W. Jones et al. Bandit Problems: Sequential Allocation of Experiments, 1987.