Approximate dynamic programming techniques for the control of time-varying queuing systems applied to call centers with abandonments and retrials

In this article we develop techniques for applying Approximate Dynamic Programming (ADP) to the control of time-varying queuing systems. First, we show that the classical state space representation in queuing systems leads to approximations that can be significantly improved by disaggregating the state, thereby increasing the dimensionality of the state space. Second, we deal with time-varying parameters by adding them to the state space together with an ADP parameterization. We demonstrate these techniques for optimal admission control in a retrial queue with abandonments and time-varying parameters. Numerical experiments show that our techniques achieve near-optimal performance.
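To give a concrete flavor of the two ideas, the sketch below combines them in a simple form: a linear value-function approximation whose features keep the queue and the retrial orbit disaggregated and include the time-varying arrival rate and the time of day, trained by semi-gradient TD(0) on a slotted simulation of a small call center with abandonments and retrials, and used greedily for admission decisions. All parameter values, cost coefficients, the sinusoidal arrival-rate profile, and the feature choice are illustrative assumptions and are not taken from the paper; the paper's own parameterization and solution method may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# ---- illustrative model (all values are assumptions, not from the paper) ----
C = 3                      # number of agents
MAX_Q, MAX_ORB = 20, 20    # truncation of queue and retrial orbit
MU = 1.0                   # service rate per busy agent
ABANDON = 0.3              # abandonment rate per waiting caller
RETRY = 0.5                # retrial rate per caller in the orbit
HOLD, ORBIT, REJECT = 1.0, 0.5, 5.0   # cost coefficients
DT, HORIZON = 0.02, 24.0   # slot length and length of one "day"

def lam(t):
    # time-varying arrival rate: a sinusoidal daily profile (assumed)
    return 2.0 + 1.5 * np.sin(2.0 * np.pi * t / HORIZON)

def phi(q, orb, t):
    # features on the *disaggregated* state (queue and orbit kept separate)
    # plus the time-varying parameter lam(t) and the time of day itself
    qn, on, tau = q / MAX_Q, orb / MAX_ORB, t / HORIZON
    return np.array([1.0, qn, qn * qn, on, on * on, qn * on,
                     lam(t) / 4.0, tau, tau * qn, tau * on])

w = np.zeros(10)           # weights of the linear value-function approximation

def v(q, orb, t):
    return phi(q, orb, t) @ w

def admit(q, orb, t):
    # greedy one-step admission rule: accept iff it looks cheaper than
    # deflecting the caller into the retrial orbit
    v_accept = v(min(q + 1, MAX_Q), orb, t)
    v_reject = REJECT + v(q, min(orb + 1, MAX_ORB), t)
    return v_accept <= v_reject

ALPHA, DISCOUNT = 1e-3, 0.999
for episode in range(200):                       # TD(0) over simulated days
    q, orb, t = 0, 0, 0.0
    while t < HORIZON:
        cost = (HOLD * max(q - C, 0) + ORBIT * orb) * DT
        q2, orb2 = q, orb
        u = rng.random()
        p_arr = lam(t) * DT
        p_srv = MU * min(q, C) * DT
        p_abn = ABANDON * max(q - C, 0) * DT
        p_ret = RETRY * orb * DT
        if u < p_arr:                            # fresh arrival
            if admit(q, orb, t) and q < MAX_Q:
                q2 = q + 1
            else:
                cost += REJECT
                orb2 = min(orb + 1, MAX_ORB)
        elif u < p_arr + p_srv:                  # service completion
            q2 = q - 1
        elif u < p_arr + p_srv + p_abn:          # abandonment from the queue
            q2 = q - 1
        elif u < p_arr + p_srv + p_abn + p_ret:  # retrial from the orbit
            if admit(q, orb, t) and q < MAX_Q:
                q2, orb2 = q + 1, orb - 1
        # semi-gradient TD(0) update of the linear approximation
        f = phi(q, orb, t)
        target = cost + DISCOUNT * v(q2, orb2, t + DT)
        w += ALPHA * (target - f @ w) * f
        q, orb, t = q2, orb2, t + DT

# after training, the learned admission rule can be inspected at a few states
print(admit(2, 0, 9.0), admit(18, 10, 9.0))
```

Because the time of day and the arrival rate are themselves features of the approximation, a single weight vector yields a time-dependent admission policy without re-solving the model for each period; this is the role the abstract ascribes to adding time-varying parameters to the state space.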
