Optimal Scheduling for Piecewise Deterministic Multi-Armed Bandit Problem

We derive explicit expressions for the priority indices (Gittins’ indices) associated with a class of multi-armed bandit processes. The underlying dynamics governing the arms belong to the class of piecewise deterministic random evolutions. We then use this class of models to discuss a simple version of the scheduling problem for a flexible manufacturing resource with limited capacity operating in a random environment.
