The Max K-Armed Bandit: A New Model of Exploration Applied to Search Heuristic Selection
[1] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.
[2] P. W. Jones, et al. Bandit Problems, Sequential Allocation of Experiments, 1987.
[3] Upendra Dave, et al. Heuristic Scheduling Systems, 1993.
[4] R. Agrawal. The Continuum-Armed Bandit Problem, 1995.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] Klaus Neumann, et al. Truncated branch-and-bound, schedule-construction, and schedule-improvement procedures for resource-constrained project scheduling, 2001, OR Spectr.
[7] Stephen F. Smith, et al. A Constraint-Based Method for Project Scheduling with Time Windows, 2002, J. Heuristics.
[8] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[9] Chris N. Potts, et al. An Iterated Dynasearch Algorithm for the Single-Machine Total Weighted Tardiness Scheduling Problem, 2002, INFORMS J. Comput.
[10] Eric P. Smith, et al. An Introduction to Statistical Modeling of Extreme Values, 2002, Technometrics.
[11] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[12] Tristan B. Smith, et al. An Effective Algorithm for Project Scheduling with Arbitrary Temporal Constraints, 2004, AAAI.
[13] Stephen F. Smith, et al. Heuristic Selection for Stochastic Search Optimization: Modeling Solution Quality by Extreme Value Theory, 2004, CP.
[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] Stephen F. Smith, et al. Enhancing Stochastic Search Performance by Value-Biased Randomization of Heuristics, 2005, J. Heuristics.