Metropolis Criterion Based Q-Learning Flow Control for High-Speed Networks

Abstract: A Metropolis criterion based Q-learning flow controller is proposed to address congestion in high-speed networks. Because such networks are uncertain and highly time-varying, complete and accurate information about them is difficult to obtain. The Q-learning algorithm, which does not depend on a mathematical model of the network, is therefore particularly well suited to this setting: it learns the optimal Q-values through interaction with the environment and uses them to improve its behavior policy. The Metropolis criterion of the simulated annealing algorithm is used to balance exploration and exploitation in Q-learning. Through learning, the proposed controller learns to take the best action to regulate the source flow, achieving high throughput and a low packet loss ratio. Simulation results show that the proposed method improves network performance and effectively avoids congestion.
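The way a Metropolis criterion can balance exploration and exploitation may be easier to see in code. The sketch below shows one common formulation, not necessarily the exact controller of this paper: a random candidate action replaces the greedy action with probability exp((Q(s, a_random) - Q(s, a_greedy)) / T), followed by the standard one-step Q-learning update. The function names, the list-of-lists Q-table, and the values of `alpha` and `gamma` are assumptions made for illustration.

```python
import math
import random


def metropolis_select(q_row, temperature, rng=random):
    """Pick an action index from one Q-table row using the Metropolis criterion.

    q_row is the list of Q-values for the current state (hypothetical layout).
    temperature must be positive; it plays the role of the annealing temperature.
    """
    greedy = max(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    # Q[candidate] <= Q[greedy], so the exponent is <= 0 and the
    # acceptance probability lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update, applied in place to the table q."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
```

At a high temperature nearly every random candidate is accepted, so the controller explores; as the temperature is lowered (for example by multiplying it by a constant slightly below 1 after each episode, an assumed schedule), the selection approaches the greedy policy. In a flow-control setting the states might encode quantities such as queue length and the actions candidate source rates, but that mapping is likewise an assumption here.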
