VLC and D2D Heterogeneous Network Optimization: A Reinforcement Learning Approach Based on Equilibrium Problems With Equilibrium Constraints

The radio frequency spectrum crunch has spurred interest in other sources of bandwidth, of which visible light is a promising candidate. Although visible light communication (VLC) offers high capacity, its coverage is limited. This motivates integrating VLC and device-to-device (D2D) technologies into heterogeneous networks. In particular, mobile users within reach of the VLC transmitters can relay data, by means of D2D communication, to mobile users that are not. However, because mobile users behave in a distributed manner, determining optimal data transmission routes from VLC transmitters to end mobile devices is a major challenge. In this paper, we propose a reinforcement learning (RL)-based approach to determine multi-hop data transmission routes in an indoor VLC-D2D heterogeneous network. We obtain the rewards for the RL-based method dynamically by formulating the interactions between the mobile users relaying the data as an equilibrium problem with equilibrium constraints (EPEC) and solving it with the alternating direction method of multipliers (ADMM). The proposed technique can achieve optimal data transmission routes in a distributed manner. Simulation results demonstrate the effectiveness of the proposed approach, showing that the learning algorithm finds transmission routes with low delays and high capacities.
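
To make the idea of RL-based multi-hop route selection concrete, the following is a minimal sketch using tabular Q-learning on a toy relay graph. It is not the method of the paper: the topology, the per-link rewards, and the names `LINKS`, `REWARD`, `learn_routes`, and `greedy_route` are all hypothetical, and the fixed reward table merely stands in for the rewards that the paper obtains dynamically from the EPEC formulation solved with ADMM.

```python
import random

# Hypothetical topology: node 0 is the VLC transmitter, node 4 the end device,
# nodes 1-3 are mobile users that may relay over D2D links.
LINKS = {0: [1, 2], 1: [3], 2: [3, 4], 3: [4], 4: []}

# Placeholder per-link rewards (negative delay). In the paper these would come
# from the EPEC/ADMM step; here they are fixed, made-up numbers.
REWARD = {
    (0, 1): -1.0, (0, 2): -4.0,
    (1, 3): -1.0, (2, 3): -1.0,
    (2, 4): -5.0, (3, 4): -1.0,
}
DEST = 4


def learn_routes(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning over (node, next_hop) pairs."""
    rng = random.Random(seed)
    q = {link: 0.0 for link in REWARD}
    for _ in range(episodes):
        node = 0
        while node != DEST:
            hops = LINKS[node]
            # Epsilon-greedy choice of the next relay.
            if rng.random() < epsilon:
                nxt = rng.choice(hops)
            else:
                nxt = max(hops, key=lambda h: q[(node, h)])
            # Best value obtainable from the next node (0 at the destination).
            future = max((q[(nxt, h)] for h in LINKS[nxt]), default=0.0)
            target = REWARD[(node, nxt)] + gamma * future
            q[(node, nxt)] += alpha * (target - q[(node, nxt)])
            node = nxt
    return q


def greedy_route(q, source=0):
    """Follow the highest-valued next hop from the source to the destination."""
    route = [source]
    while route[-1] != DEST:
        node = route[-1]
        route.append(max(LINKS[node], key=lambda h: q[(node, h)]))
    return route


if __name__ == "__main__":
    print(greedy_route(learn_routes()))
```

With these placeholder rewards, the lowest-delay route runs through relays 1 and 3, and the greedy route read from the learned table should settle on it. Each node only needs the values of its own outgoing links and those of its chosen neighbor, which is what makes this kind of update amenable to a distributed implementation.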
