Paper information - Branching bandits and Klimov's problem: achievable region and side constraints

This paper considers the average-cost branching bandits problem and its special case known as Klimov's problem. Let n be the vector whose components are the mean numbers of bandits (or customers) of each type that are present. The authors fully characterize the achievable region, that is, the set of all vectors n that can be obtained by considering all possible policies. While the original description of the achievable region involves exponentially many constraints, the authors also develop an alternative description that involves only O(R^2) variables and constraints, where R is the number of bandit types (or customer classes). They then consider the problem of minimizing a linear function of n subject to L additional linear constraints on n, and show that optimal policies can be obtained by randomizing between L+1 strict priority policies, which can be found efficiently (in polynomial time) using linear programming techniques.
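
The constrained problem described in the abstract can be written as a linear program. The following is only a sketch, in notation assumed here rather than taken from the paper: c is the cost vector, \mathcal{P} is the achievable region, and a_\ell, b_\ell are the data of the L side constraints.

```latex
\begin{aligned}
\min_{n}\quad & c^{\top} n \\
\text{subject to}\quad & n \in \mathcal{P}, \\
& a_{\ell}^{\top} n \le b_{\ell}, \qquad \ell = 1, \dots, L .
\end{aligned}
```

Assuming \mathcal{P} is a bounded polyhedron whose extreme points are attained by strict priority policies, an optimal basic solution lies on a face of \mathcal{P} of dimension at most L. By Carathéodory's theorem it is then a convex combination of at most L+1 extreme points, which is consistent with the randomization over L+1 strict priority policies stated in the abstract.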

J. Tsitsiklis | D. Bertsimas | I. Paschalidis
