Throughput Enhancement via Multi-Armed Bandit in Heterogeneous 5G Networks

In heterogeneous networks, a user equipment (UE) can communicate directly with the macro base station (BS) or with a small, low-power pico or femto BS. Alternatively, it can communicate indirectly with the macro BS through one or more intermediate devices (UEs) or a relay station that uses the over-the-air backhaul to the macro BS. Because wireless communication is highly dynamic and uncertain, a UE must choose both an optimal communication mode and the neighbor to which it connects, e.g., a macro/small BS in the direct communication mode or a nearby relay/device in the indirect communication mode. In this paper, we apply an effective reinforcement learning method, the multi-armed bandit (MAB), to address this problem. Specifically, we apply MAB with Thompson sampling to pick an optimal arm, i.e., a neighbor that determines the communication mode and the resulting performance, while effectively dealing with the exploration-exploitation dilemma in MAB. In a simulation study undertaken in Matlab, we compare the proposed approach with several baselines representing the current state of the art. Our approach improves the throughput, normalized to the optimal throughput, by approximately 8-97% relative to these baselines. Further, it improves the throughput by up to 15% compared to the best-performing baseline [1], [2].
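The arm-selection idea described above can be illustrated with a minimal Beta-Bernoulli Thompson sampling sketch in Python. This is not the paper's Matlab simulation: the neighbor names, the per-neighbor success probabilities, and the binary (success/failure) reward model are all assumptions chosen for illustration only. Each arm stands for one candidate neighbor, and the UE keeps a Beta posterior over that neighbor's success probability.

```python
import random


def thompson_select(successes, failures, rng):
    """Draw one sample from each arm's Beta posterior; return the arm with the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_thompson(success_probs, rounds, seed=0):
    """Simulate Thompson sampling over Bernoulli arms and return the pull count per arm."""
    rng = random.Random(seed)
    n = len(success_probs)
    successes = [0] * n  # observed successes per arm
    failures = [0] * n   # observed failures per arm
    pulls = [0] * n
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        # Hypothetical binary reward: 1 if the transmission via this neighbor succeeds.
        reward = 1 if rng.random() < success_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical neighbors and success probabilities (assumed, not taken from the paper).
NEIGHBORS = ["macro_bs", "pico_bs", "relay", "d2d_ue"]
SUCCESS_PROBS = [0.45, 0.60, 0.80, 0.30]

pulls = run_thompson(SUCCESS_PROBS, rounds=2000)
most_selected = NEIGHBORS[max(range(len(pulls)), key=pulls.__getitem__)]
print(dict(zip(NEIGHBORS, pulls)), "most selected:", most_selected)
```

Sampling from the posterior, rather than always picking the arm with the highest empirical mean, is what balances exploration and exploitation here: arms with few observations have wide posteriors and are still tried occasionally, while the pull counts concentrate on the best neighbor as evidence accumulates.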

[1] Setareh Maghsudi et al. Multi-armed bandits with application to 5G small cells, 2015, IEEE Wireless Communications.

[2] Shipra Agrawal et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.

[3] Setareh Maghsudi et al. On Transmission Mode Selection in D2D-Enhanced Small Cell Networks, 2017, IEEE Wireless Communications Letters.

[4] Benjamin Van Roy et al. A Tutorial on Thompson Sampling, 2017, Found. Trends Mach. Learn.

[5] Deepak R Dandekar et al. Relay Node Placement for Multi-Path Connectivity in Heterogeneous Wireless Sensor Networks, 2012.

[6] Husheng Li et al. Learning the Spectrum via Collaborative Filtering in Cognitive Radio Networks, 2010, 2010 IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN).

[7] Jinho Choi et al. Beamforming for Dual-Hop MIMO AF Relay Networks With Channel Estimation Error and Feedback Delay, 2017, IEEE Access.

[8] Shahid Mumtaz et al. Smart heterogeneous networks: a 5G paradigm, 2017, Telecommunication Systems.

[9] Setareh Maghsudi et al. Transmission mode selection for network-assisted device to device communication: A Levy-bandit approach, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] Sofie Pollin et al. An adaptive channel selection scheme for reliable TSCH-based communication, 2015, 2015 International Symposium on Wireless Communication Systems (ISWCS).

[11] Yuan Zhou et al. Heterogeneous network: An evolutionary path to 5G, 2015, 2015 21st Asia-Pacific Conference on Communications (APCC).

[12] John Myles White et al. Bandit Algorithms for Website Optimization, 2012.

[13] Tony Q. S. Quek et al. Heterogeneous network throughput with hybrid-duplex systems, 2014, 2014 IEEE Global Communications Conference.

[14] Ivan Seskar et al. HetNetwork Coding: Scaling Throughput in Heterogeneous Networks using Multiple Radio Interfaces, 2014, ArXiv.

[15] Mikio Hasegawa et al. Application of multi-armed bandit algorithms for channel sensing in cognitive radio, 2012, 2012 IEEE Asia Pacific Conference on Circuits and Systems.

[16] Ashok K. Agrawala et al. Thompson Sampling for Dynamic Multi-armed Bandits, 2011, 2011 10th International Conference on Machine Learning and Applications and Workshops.

[17] Daniel Benevides da Costa et al. Beamforming in Traffic-Aware Two-Way Relay Systems With Channel Estimation Error and Feedback Delay, 2017, IEEE Transactions on Vehicular Technology.

[18] Hossam S. Hassanein et al. Relay Node Deployment Strategies in Heterogeneous Wireless Sensor Networks, 2010, IEEE Transactions on Mobile Computing.

[19] Lihong Li et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.