Continuous-time restless bandit and dynamic scheduling for make-to-stock production

We study the "Whittle relaxation" of the continuous-time Restless Bandit problem, for both discrete and continuous state spaces, under the discounted cost criterion. We derive explicit expressions for Whittle's priority indices, which generalize the Gittins indices. This formalism is then used to construct dynamic scheduling rules for flexible make-to-stock production. Finally, these analytical results are compared with the optimal policy, computed numerically for a server delivering two product types. We observe that the Whittle relaxation of the Restless Bandit model yields nearly optimal dynamic scheduling rules.
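As an illustration of how a priority-index rule turns into a scheduling decision, here is a minimal sketch. It assumes each product type has an index function of its current stock level, and that the server produces the product with the largest index. The names `linear_index` and `pick_product`, the linear form of the index, and the convention of idling when no index is positive are all hypothetical placeholders; they are not the explicit index expressions derived in the paper.

```python
from typing import Callable, Optional, Sequence

IndexFn = Callable[[float], float]


def linear_index(intercept: float, slope: float) -> IndexFn:
    """Hypothetical placeholder index, decreasing in the stock level.

    This is NOT the paper's Whittle index; it only stands in for one.
    """
    return lambda stock: intercept - slope * stock


def pick_product(stocks: Sequence[float],
                 indices: Sequence[IndexFn]) -> Optional[int]:
    """Return the product type to produce next, or None to idle.

    Assumption: the server serves the largest index and idles when
    no index is strictly positive.
    """
    values = [index(stock) for index, stock in zip(indices, stocks)]
    best = max(range(len(values)), key=values.__getitem__)
    return best if values[best] > 0 else None


if __name__ == "__main__":
    # Two product types, as in the numerical comparison in the abstract.
    rules = [linear_index(5.0, 1.0), linear_index(3.0, 0.5)]
    print(pick_product([2.0, 4.0], rules))
```

The point of the sketch is only that the decision depends on each product's own state through its index, so the rule scales to many product types without solving the joint optimization problem.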
