Choosing a Good Toolkit, II: Simulations and Conclusions
This paper continues our study of heuristics for dynamically choosing tools to put in a toolkit, where the value of any tool can be discovered only by choosing it. This is a multi-armed bandit problem with “arms” that are not independent, so the optimal solution is virtually impossible to find. Using simulation on a small number of test problems, we study the relative performance of several heuristic decision rules for discount factors from 0.7 to 0.99, and learn the strengths and weaknesses of each.
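To make the simulation setup concrete, the following is a minimal sketch of how such a comparison might be run. It is not the paper's method: the two decision rules (`stick_with_best`, `explore_then_exploit`), the way correlated tool values are generated (a shared component plus tool-specific noise), and the horizon are all assumptions chosen for illustration. Only the ideas that a tool's value is revealed by choosing it, and that payoffs are discounted by a factor between 0.7 and 0.99, come from the abstract.

```python
import random


def discounted_value(payoffs, delta):
    """Return the sum of delta**t * payoff_t over the payoff sequence."""
    return sum((delta ** t) * x for t, x in enumerate(payoffs))


def simulate(rule, values, delta, horizon=200):
    """Run one decision rule; a tool's value is revealed only once it is chosen."""
    known = {}  # tool index -> value discovered so far
    payoffs = []
    for _ in range(horizon):
        tool = rule(known, len(values))
        known[tool] = values[tool]
        payoffs.append(values[tool])
    return discounted_value(payoffs, delta)


def stick_with_best(known, n):
    """Hypothetical rule: try tool 0, then always reuse the best known tool."""
    if not known:
        return 0
    return max(known, key=known.get)


def explore_then_exploit(known, n):
    """Hypothetical rule: try every tool once, then reuse the best one."""
    for i in range(n):
        if i not in known:
            return i
    return max(known, key=known.get)


def correlated_values(n, rng):
    """Assumed dependence structure: a shared component plus per-tool noise."""
    common = rng.gauss(0.0, 1.0)
    return [common + rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    problems = [correlated_values(5, rng) for _ in range(1000)]
    for delta in (0.7, 0.9, 0.99):
        for rule in (stick_with_best, explore_then_exploit):
            mean = sum(simulate(rule, v, delta) for v in problems) / len(problems)
            print(f"delta={delta:<5} {rule.__name__:<22} mean value={mean:.3f}")
```

A lower discount factor weights early payoffs more heavily, so a rule that spends early periods exploring pays a larger relative price; averaging over many randomly drawn problems is one simple way to compare rules across discount factors.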