An Asymptotically Optimal Algorithm for the Max k-Armed Bandit Problem
[1] P. W. Jones, et al. Bandit Problems, Sequential Allocation of Experiments, 1987.
[2] Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.
[3] Philip W. L. Fong. A Quantitative Study of Hypothesis Selection, 1995, ICML.
[4] Eric P. Smith, et al. An Introduction to Statistical Modeling of Extreme Values, 2002, Technometrics.
[5] Stephen F. Smith, et al. Heuristic Selection for Stochastic Search Optimization: Modeling Solution Quality by Extreme Value Theory, 2004, CP.
[6] Stephen F. Smith, et al. The Max K-Armed Bandit: A New Model of Exploration Applied to Search Heuristic Selection, 2005, AAAI.
[7] H. Robbins. Some aspects of the sequential design of experiments, 1952.