An Online Adaptive Bandwidth Allocation Optimization Algorithm for Wireless Multimedia Communication Networks

This paper considers quality-of-service (QoS) provisioning for adaptive multimedia in wireless communication networks and proposes an online adaptive bandwidth allocation optimization algorithm based on reinforcement learning. First, an event-driven stochastic switching model is introduced to formulate the adaptive bandwidth allocation problem as a constrained continuous-time Markov decision problem. Then, an online optimization algorithm is derived that combines learning-based policy gradient estimation with stochastic approximation. The algorithm handles the constrained optimization problem online and efficiently, without explicit knowledge of the underlying system parameters. Moreover, unlike previous schemes, it does not require computing performance potentials or related quantities (e.g., Q-factors), and therefore significantly reduces computational cost. Simulation results demonstrate the effectiveness of the proposed algorithm.
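To make the combination of policy gradient estimation and stochastic approximation concrete, the following is a minimal sketch on a toy model, not the algorithm of the paper. Everything in it is an assumption made for illustration: a single cell with `capacity` channels, Poisson arrivals (`lam`) and exponential holding times (`mu`), a one-parameter randomized admission policy `sigmoid(theta)`, a reward equal to the time-average number of carried calls, and a constraint that the fraction of time the cell is full stays below `cost_bound`. The gradient is estimated with a score-function (likelihood ratio) estimator over fixed-length episodes, and both the policy parameter and a Lagrange multiplier are updated with diminishing step sizes. No performance potentials or Q-factors are computed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def run_episode(theta, rng, capacity=5, lam=4.0, mu=1.0, n_events=200):
    """Simulate one event-driven episode of a toy loss system (hypothetical model).

    Returns (avg_reward, avg_cost, score), where score is the sum of
    d/dtheta log pi(action) over all admission decisions in the episode.
    """
    p_admit = sigmoid(theta)
    n = 0              # calls in progress
    total_time = 0.0
    reward = 0.0       # time integral of the number of carried calls
    cost = 0.0         # time integral of the "cell is full" indicator
    score = 0.0
    for _ in range(n_events):
        rate = lam + n * mu                  # total event rate in state n
        dwell = rng.expovariate(rate)        # time until the next event
        total_time += dwell
        reward += n * dwell
        if n == capacity:
            cost += dwell
        if rng.random() < lam / rate:        # arrival event
            if n < capacity:                 # a decision is needed only if there is room
                if rng.random() < p_admit:
                    n += 1
                    score += 1.0 - p_admit   # d/dtheta log sigmoid(theta)
                else:
                    score -= p_admit         # d/dtheta log(1 - sigmoid(theta))
        else:                                # departure event
            n -= 1
    return reward / total_time, cost / total_time, score


def train(cost_bound=0.2, iterations=300, seed=0):
    """Primal-dual stochastic approximation driven by score-function gradients."""
    rng = random.Random(seed)
    theta, multiplier, baseline = 0.0, 0.0, 0.0
    for k in range(1, iterations + 1):
        avg_reward, avg_cost, score = run_episode(theta, rng)
        lagrangian = avg_reward - multiplier * (avg_cost - cost_bound)
        step = 1.0 / (10.0 + k)              # diminishing step size
        # Ascent step on theta; the baseline only reduces variance.
        theta += step * (lagrangian - baseline) * score
        theta = max(-10.0, min(10.0, theta))  # projection onto a bounded set
        # The multiplier grows while the constraint is violated and never goes negative.
        multiplier = max(0.0, multiplier + step * (avg_cost - cost_bound))
        # The baseline is updated after use, so it is independent of the current episode.
        baseline += 0.1 * (lagrangian - baseline)
    return theta, multiplier


if __name__ == "__main__":
    theta, multiplier = train()
    print("admission probability:", sigmoid(theta), "multiplier:", multiplier)
```

The learner only observes simulated events and never uses `lam` or `mu` directly, which mirrors the claim that no explicit knowledge of the system parameters is needed. The fixed-horizon episodes and the scalar policy are simplifications chosen to keep the sketch short.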
