Paper information - Index policies for a class of discounted restless bandits

The paper concerns a class of discounted restless bandit problems which possess an indexability property. Conservation laws yield an expression for the reward suboptimality of a general policy. These results are utilised to study the closeness to optimality of an index policy for a special class of simple and natural dual speed restless bandits for which indexability is guaranteed. The strong performance of the index policy is confirmed by a computational study.

K. Glazebrook | J. Niño-Mora | P. Ansell

[1] J. Gittins. Bandit processes and dynamic allocation indices, 1979.

[2] Jean Walrand, et al. Extensions of the multiarmed bandit problem: The discounted case, 1985.

[3] P. Whittle. Restless Bandits: Activity Allocation in a Changing World, 1988.

[4] Christian M. Ernst, et al. Multi-armed Bandit Allocation Indices, 1989.

[5] J. Bather, et al. Multi-Armed Bandit Allocation Indices, 1990.

[6] R. Weber, et al. Addendum to ‘On an index policy for restless bandits’, 1991, Advances in Applied Probability.

[7] Dimitris Bertsimas, et al. Conservation laws, extended polymatroids and multi-armed bandit problems: a unified approach to indexable systems, 2011, IPCO.

[8] John N. Tsitsiklis, et al. The complexity of optimal queueing network control, 1994, Proceedings of IEEE 9th Annual Conference on Structure in Complexity Theory.

[9] Lawrence M. Wein, et al. Scheduling a Make-To-Stock Queue: Index Policies and Hedging Points, 1996, Oper. Res.

[10] Dimitris Bertsimas, et al. Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems, 1996, Math. Oper. Res.

[11] Jean-Arcady Meyer, et al. Behaviors Coordination Using Restless Bandits Allocation Indexes, 1998.

[12] John N. Tsitsiklis, et al. The Complexity of Optimal Queuing Network Control, 1999, Math. Oper. Res.

[13] Kevin D. Glazebrook, et al. Almost optimal policies for stochastic systems which almost satisfy conservation laws, 1999, Ann. Oper. Res.

[14] José Niño Mora. Restless Bandits, Partial Conservation Laws and Indexability, 2000.

[15] K. Glazebrook, et al. Index-based policies for discounted multi-armed bandits on parallel machines, 2000.

[16] Kevin D. Glazebrook, et al. Parallel Scheduling of Multiclass M/M/m Queues: Approximate and Heavy-Traffic Optimization of Achievable Performance, 2001, Oper. Res.