A REINFORCEMENT LEARNING APPROACH TO CONGESTION CONTROL OF HIGH-SPEED MULTIMEDIA NETWORKS

ABSTRACT A reinforcement learning scheme for congestion control in high-speed networks is presented. Traditional congestion control methods monitor the queue length and set the source rate according to it. In these methods, however, the congestion threshold and the sending rate are difficult to determine jointly. We propose a simple and robust reinforcement learning congestion controller (RLCC) to solve this problem. The scheme consists of two subsystems: an expectation-return predictor, which acts as a long-term policy evaluator, and a short-term rate selector, which is composed of an action-value evaluator and a stochastic action selector. RLCC receives reinforcement signals generated by an immediate reward evaluator and selects the action that best controls the source flow with respect to high throughput and low cell loss rate. Through on-line learning, RLCC adapts to time-varying environments and takes correct actions increasingly often. Simulation results show that, compared with the popular best-effort scheme, the proposed approach increases system utilization and decreases packet losses at the same time.
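The interplay of the four components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic tabular actor-critic with a TD(0) update, a Boltzmann (softmax) action selector, and a toy single-queue model. Every name, constant, and the form of the reward function (`RATES`, `CAPACITY`, `BUFFER`, `N_STATES`, `reward`, `step`, `RLCCSketch`) is hypothetical and chosen only for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
RATES = [0.2, 0.4, 0.6, 0.8, 1.0]  # candidate source rates per time step
CAPACITY = 0.7                     # amount the bottleneck serves per time step
BUFFER = 2.0                       # maximum queue length before losses occur
N_STATES = 6                       # queue length discretised into levels 0..5


def state_of(queue):
    """Map a queue length in [0, BUFFER] to a discrete state index."""
    return min(N_STATES - 1, int(queue / BUFFER * (N_STATES - 1)))


def step(queue, rate):
    """Toy fluid queue: return (next queue length, amount served, amount lost)."""
    backlog = queue + rate
    served = min(backlog, CAPACITY)
    remaining = backlog - served
    lost = max(0.0, remaining - BUFFER)
    return min(remaining, BUFFER), served, lost


def reward(served, lost):
    """Immediate reward evaluator (assumed form): reward throughput, penalise loss."""
    return served - 5.0 * lost


class RLCCSketch:
    """Tabular actor-critic stand-in for the two subsystems in the abstract."""

    def __init__(self, alpha=0.1, beta=0.1, gamma=0.9, temperature=0.2, seed=0):
        self.v = [0.0] * N_STATES  # expectation-return predictor (long-term)
        self.q = [[0.0] * len(RATES) for _ in range(N_STATES)]  # action-value evaluator
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.temperature = temperature
        self.rng = random.Random(seed)

    def select(self, s):
        """Stochastic action selector: softmax over the action values of state s."""
        prefs = self.q[s]
        top = max(prefs)  # subtract the maximum for numerical stability
        weights = [math.exp((p - top) / self.temperature) for p in prefs]
        return self.rng.choices(range(len(RATES)), weights=weights)[0]

    def learn(self, s, a, r, s_next):
        """TD(0) update of the predictor; the same error adjusts the action value."""
        td_error = r + self.gamma * self.v[s_next] - self.v[s]
        self.v[s] += self.alpha * td_error
        self.q[s][a] += self.beta * td_error
        return td_error


def run(steps=2000, seed=0):
    """On-line learning loop; returns the agent, mean served and mean lost per step."""
    agent = RLCCSketch(seed=seed)
    queue, total_served, total_lost = 0.0, 0.0, 0.0
    for _ in range(steps):
        s = state_of(queue)
        a = agent.select(s)
        queue, served, lost = step(queue, RATES[a])
        agent.learn(s, a, reward(served, lost), state_of(queue))
        total_served += served
        total_lost += lost
    return agent, total_served / steps, total_lost / steps
```

Calling `run()` returns the trained agent together with the mean amount served and lost per step, which can be compared against a fixed-rate baseline in the same toy queue. The loss penalty weight in `reward` is the main design choice here: it sets the trade-off between throughput and loss that the abstract describes.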
