Paper information - A neuro-dynamic programming approach to call admission control in integrated service networks: the single link case

We formulate the call admission control problem for a single link in an integrated service environment as a Markov Decision Problem. In principle, an optimal admission control policy can be computed using methods of Dynamic Programming. However, because the number of possible states of the underlying Markov Chain grows exponentially in the number of customer classes, Dynamic Programming algorithms are computationally infeasible for problems of realistic size. We try to overcome this so-called "curse of dimensionality" by using methods of Neuro-Dynamic Programming (NDP for short). NDP employs simulation-based algorithms and function approximation techniques to find control policies for large-scale Markov Decision Problems. We apply two methods of NDP to the call admission control problem: the TD(0) algorithm and Approximate Policy Iteration. We assess the performance of these methods by comparing them with two heuristic policies: a policy that always accepts a new customer when the required resources are available, and a threshold policy.

This research was supported by a contract with Siemens AG, Munich, Germany.
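The two heuristic baselines and the TD(0) method named in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not taken from the paper: the state is a tuple of active-call counts per customer class, each class has a fixed bandwidth demand, the threshold policy caps the number of active calls per class, and the value function is approximated linearly in a feature vector. The abstract does not specify the approximation architecture or the exact form of the threshold policy.

```python
from typing import List, Sequence, Tuple

# Assumed state representation: number of active calls per customer class.
State = Tuple[int, ...]


def used_bandwidth(state: State, demand: Sequence[int]) -> int:
    """Total bandwidth occupied on the link by the active calls."""
    return sum(n * b for n, b in zip(state, demand))


def always_accept(state: State, cls: int, demand: Sequence[int],
                  capacity: int) -> bool:
    """Accept a class-`cls` call whenever the required bandwidth is free."""
    return used_bandwidth(state, demand) + demand[cls] <= capacity


def threshold_policy(state: State, cls: int, demand: Sequence[int],
                     capacity: int, thresholds: Sequence[int]) -> bool:
    """Accept only if bandwidth is free AND the class is below its cap.

    The per-class cap is a hypothetical form of "threshold policy".
    """
    return (always_accept(state, cls, demand, capacity)
            and state[cls] < thresholds[cls])


def td0_update(theta: Sequence[float], phi_s: Sequence[float],
               phi_next: Sequence[float], reward: float,
               alpha: float, gamma: float) -> List[float]:
    """One TD(0) step for a linear approximation V(s) ~ theta . phi(s)."""
    v_s = sum(t * f for t, f in zip(theta, phi_s))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    delta = reward + gamma * v_next - v_s  # temporal-difference error
    return [t + alpha * delta * f for t, f in zip(theta, phi_s)]
```

In a simulation, one of the policies would decide each arriving call, and `td0_update` would be applied to successive observed states. With exact Dynamic Programming the value table needs one entry per state, which is what becomes infeasible as the number of classes grows; the approximation replaces that table with the much smaller parameter vector `theta`.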
