Towards 5G: A Reinforcement Learning-Based Scheduling Solution for Data Traffic Management

Dominated by delay-sensitive and massive-data applications, radio resource management in 5G access networks is expected to satisfy very stringent delay and packet loss requirements. In this context, the packet scheduler plays a central role by allocating user data packets in the frequency domain at each predefined time interval. Standard scheduling rules are known to be limited in satisfying higher quality of service (QoS) demands when facing unpredictable network conditions and dynamic traffic circumstances. This paper proposes an innovative scheduling framework that selects different scheduling rules according to the instantaneous scheduler state in order to minimize packet delays and packet drop rates for applications with strict QoS requirements. To deal with real-time scheduling, reinforcement learning (RL) principles are used to map scheduling rules to each state and to learn when to apply each rule. Additionally, neural networks are used as function approximators to cope with the RL complexity and the very large scheduler state space. Simulation results demonstrate that the proposed framework outperforms conventional scheduling strategies in terms of delay and packet drop rate requirements.
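
The idea of learning which scheduling rule to apply in each scheduler state can be illustrated with a minimal sketch. It is not the framework evaluated in the paper: it assumes Q-learning with one linear weight vector per rule in place of the neural network approximators, and the rule labels, the three state features (bias, normalized delay, normalized load), and the reward in `toy_reward` are all hypothetical choices made only for illustration.

```python
import random

# Hypothetical candidate scheduling rules; the names are labels only.
RULES = ["PF", "EXP/PF", "M-LWDF"]


class LinearQ:
    """Q-learning with one linear weight vector per action.

    Assumption: a linear approximator stands in for the neural networks
    mentioned in the abstract, to keep the sketch dependency-free.
    """

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.9):
        self.weights = [[0.0] * n_features for _ in range(n_actions)]
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def q(self, state, action):
        # Approximate action value: dot product of weights and features.
        return sum(w * x for w, x in zip(self.weights[action], state))

    def greedy(self, state):
        values = [self.q(state, a) for a in range(len(self.weights))]
        return values.index(max(values))

    def select(self, state, epsilon, rng):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            return rng.randrange(len(self.weights))
        return self.greedy(state)

    def update(self, state, action, reward, next_state):
        # One-step temporal-difference update towards r + gamma * max_a Q(s', a).
        best_next = max(self.q(next_state, a) for a in range(len(self.weights)))
        td_error = reward + self.gamma * best_next - self.q(state, action)
        self.weights[action] = [
            w + self.alpha * td_error * x
            for w, x in zip(self.weights[action], state)
        ]
        return td_error


def toy_reward(state, action):
    # Assumed reward, not taken from the paper: rule index 2 is rewarded
    # when the normalized delay feature is high, rule index 0 otherwise.
    preferred = 2 if state[1] > 0.5 else 0
    return 1.0 if action == preferred else -1.0


def train(steps=2000, seed=0):
    # Toy loop: states are drawn at random as [bias, delay, load].
    rng = random.Random(seed)
    agent = LinearQ(n_features=3, n_actions=len(RULES))
    state = [1.0, rng.random(), rng.random()]
    for _ in range(steps):
        action = agent.select(state, 0.1, rng)
        reward = toy_reward(state, action)
        next_state = [1.0, rng.random(), rng.random()]
        agent.update(state, action, reward, next_state)
        state = next_state
    return agent
```

After `agent = train()`, the expression `RULES[agent.greedy([1.0, 0.9, 0.2])]` returns the label of the rule the toy agent prefers for a high-delay state. A real scheduler state would carry far more features, which is where a neural network approximator becomes useful.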