Conservation laws, extended polymatroids and multi-armed bandit problems: a unified approach to indexable systems

We show that if performance measures in stochastic and dynamic scheduling problems satisfy generalized conservation laws, then the feasible region of achievable performance is a polyhedron called an extended polymatroid, which generalizes the classical polymatroids introduced by Edmonds. Optimization of a linear objective over an extended polymatroid is solved by an adaptive greedy algorithm, which leads to an optimal solution having an indexability property (indexable systems). Under a certain condition, the indices possess a stronger decomposition property (decomposable systems). The following problems can be analyzed using our theory: multiarmed bandit problems, branching bandits, scheduling of multiclass queues with or without feedback, scheduling of a batch of jobs. Consequences of our results include: (1) a characterization of indexable systems as systems that satisfy generalized conservation laws, (2) a sufficient condition for indexable systems to be decomposable, (3) a new linear programming proof of the decomposability property of Gittins indices in multiarmed bandit problems, (4) an approach to sensitivity analysis of indexable systems, (5) a characterization of the indices of indexable systems as sums of dual variables, and an economic interpretation of the branching bandit indices in terms of retirement options, (6) an analysis of the indexability of undiscounted branching bandits, (7) a new algorithm to compute the indices of indexable systems (in particular Gittins indices), as fast as the fastest known algorithm, (8) a unification of Klimov's algorithm for multiclass queues and Gittins' algorithm for multiarmed bandits as special cases of the same algorithm, (9) a closed formula for the maximum reward of the multiarmed bandit problem, with a new proof of its submodularity, and (10) an understanding of the invariance of the indices with respect to some parameters of the problem.
Our approach provides a polyhedral treatment of several classical problems in stochastic and dynamic scheduling and is able to address variations such as: discounted versus undiscounted cost criterion, rewards versus taxes, discrete versus continuous time, and linear versus nonlinear objective functions.
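
The abstract refers to Gittins indices for multiarmed bandit problems. As a rough illustration of what such an index is, the sketch below computes the indices of a single finite-state arm with discounted rewards. It uses the restart-in-state formulation associated with Katehakis and Veinott [20], solved by plain value iteration; it is not the adaptive greedy algorithm of the paper. The function name `gittins_indices`, the inputs `P`, `r`, `beta`, and the stopping tolerance are assumptions made for this example.

```python
def gittins_indices(P, r, beta, tol=1e-10, max_iter=100_000):
    """Gittins indices of one finite-state arm (illustrative sketch).

    P    : list of rows, P[j][k] = probability of moving j -> k when played
    r    : list, r[j] = reward for playing the arm in state j
    beta : discount factor, assumed to satisfy 0 < beta < 1
    """
    n = len(r)
    indices = []
    for i in range(n):
        # Restart problem for state i: in every state j we may either
        # continue from j, or restart, i.e. act as if the arm were in state i.
        V = [0.0] * n
        for _ in range(max_iter):
            restart = r[i] + beta * sum(P[i][k] * V[k] for k in range(n))
            V_new = [
                max(r[j] + beta * sum(P[j][k] * V[k] for k in range(n)), restart)
                for j in range(n)
            ]
            delta = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if delta < tol:
                break
        # The index of state i is (1 - beta) times the optimal value
        # of the restart problem evaluated at state i itself.
        indices.append((1.0 - beta) * V[i])
    return indices


# Hypothetical two-state arm: state 0 pays 0 and moves to state 1,
# state 1 pays 1 forever.
P = [[0.0, 1.0], [0.0, 1.0]]
r = [0.0, 1.0]
print(gittins_indices(P, r, beta=0.9))  # approximately [0.9, 1.0]
```

In this toy arm, state 1 earns reward 1 per play, so its index is 1. From state 0 the best achievable ratio of discounted reward to discounted time is approached by playing forever, which gives beta = 0.9. Value iteration is chosen here only for brevity; it is much slower than the index algorithms the abstract has in mind.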

Dimitris Bertsimas | José Niño-Mora

[3] Alan Cobham, et al. Priority Assignment in Waiting Line Problems, 1954, Oper. Res.

[4] Wayne E. Smith. Various optimizers for single-stage production, 1956.

[5] Shaler Stidham, et al. L = λW: A Discounted Analogue and a New Proof, 1972, Oper. Res.

[6] G. Klimov. Time-Sharing Service Systems. I, 1975.

[7] J. Michael Harrison, et al. A Priority Queue with Discounted Linear Costs, 1975, Oper. Res.

[8] J. Michael Harrison, et al. Dynamic Scheduling of a Multiclass Queue: Discount Optimality, 1975, Oper. Res.

[9] Dong-Wan Tcha, et al. Optimal Control of Single-Server Queuing Networks and Multi-Class M/G/1 Queues with Feedback, 1977, Oper. Res.

[10] J. Gittins. Bandit processes and dynamic allocation indices, 1979.

[11] Edward G. Coffman, et al. A Characterization of Waiting Time Performance Realizable by Single-Server Queues, 1980, Oper. Res.

[12] P. Whittle. Multi-Armed Bandits and the Gittins Index, 1980.

[13] Erol Gelenbe, et al. Analysis and Synthesis of Computer Systems, 1980.

[14] P. Whittle. Arm-Acquiring Bandits, 1981.

[15] Peter Whittle, et al. Optimization Over Time, 1982.

[16] Jean Walrand, et al. Extensions of the multiarmed bandit problem: The discounted case, 1985.

[17] Michael N. Katehakis, et al. Linear Programming for Finite State Multi-Armed Bandit Problems, 1986, Math. Oper. Res.

[18] J. Tsitsiklis. A lemma on the multiarmed bandit problem, 1986.

[19] Kevin D. Glazebrook, et al. Sensitivity Analysis for Stochastic Scheduling Problems, 1987, Math. Oper. Res.

[20] Michael N. Katehakis, et al. The Multi-Armed Bandit Problem: Decomposition and Computation, 1987, Math. Oper. Res.

[21] A. Federgruen, et al. M/G/c queueing systems with multiple customer classes: characterization and control of achievable performance under nonpreemptive priority rules, 1988.

[22] Gideon Weiss, et al. Branching Bandit Processes, 1988, Probability in the Engineering and Informational Sciences.

[23] Awi Federgruen, et al. Characterization and Optimization of Achievable Performance in General Queueing Systems, 1988, Oper. Res.

[24] G. Weiss, et al. Scheduling Stochastic Jobs with a Two-Point Distribution on Two Parallel Machines, 1989, Probability in the Engineering and Informational Sciences.

[25] David D. Yao, et al. Optimal dynamic scheduling in Jackson networks, 1989.

[26] J. Walrand, et al. Interchange arguments in stochastic scheduling, 1989.

[27] Eugene L. Lawler, et al. Sequencing and scheduling: algorithms and complexity, 1989.

[28] R. Weber. On the Gittins Index for Multiarmed Bandits, 1992.

[29] John N. Tsitsiklis, et al. Optimization of multiclass queuing networks: polyhedral and nonlinear characterizations of achievable performance, 1994.

[30] Leonidas Georgiadis, et al. Extended Polymatroids: Properties and Optimization, 1992, Conference on Integer Programming and Combinatorial Optimization.

[31] David D. Yao, et al. Multiclass Queueing Systems: Polymatroidal Structure and Optimal Scheduling Control, 1992, Oper. Res.

[32] J. Tsitsiklis. A short proof of the Gittins index theorem, 1993, Proceedings of 32nd IEEE Conference on Decision and Control.

[33] Leonidas Georgiadis, et al. Problems of Adaptive Optimization In Multiclass M/GI/1 Queues with Bernoulli Feedback, 1995, Math. Oper. Res.