Resources Sharing in 5G Networks: Learning-Enabled Incentives and Coalitional Games

Smart systems are often battery-constrained and compete for resources from remote clouds, which results in high delay. Collaboratively sharing resources among nearby devices is a promising way to control this delay for time-sensitive applications. Few existing studies address the interplay between ubiquitous cooperation and competition using learning-enabled incentives. In this article, intelligent algorithms are introduced in a distributed fashion; they encapsulate cooperation and competition to coordinate the overall goal of the cellular system with the individual goals of Internet of Things (IoT) devices. First, the utility functions of the cell and of the IoT users are designed separately. For the former, an incentive mechanism is constructed, in which a novel deep actor-critic learning algorithm with a prioritized queue is developed for the continuous action space of the differentiated decision-making procedure. For the latter, the energy model is taken into account. Furthermore, a coalition game combined with a deep Q-learning framework is explored to model and incentivize the cooperation and competition process. Theoretical analysis and simulation studies demonstrate that the improved algorithms perform better than their original versions and can converge to a Nash-stable optimal or asymptotically optimal solution.
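
To make the notion of a Nash-stable solution concrete, the following is a minimal sketch of coalition formation by unilateral switching. It is not the article's algorithm: the abstract does not give the utility functions or the deep Q-learning switch rule, so the equal-share utility `capacity / size` and the names `utility`, `form_coalitions`, and `is_nash_stable` are assumptions made purely for illustration. A partition is Nash-stable when no device can strictly increase its own utility by leaving its coalition for another one.

```python
def utility(capacity, size):
    # Assumed toy utility: a coalition's resource is shared equally by its members.
    return capacity / size


def form_coalitions(capacities, n_devices, max_rounds=1000):
    """Let devices switch coalitions one at a time until nobody wants to move."""
    assignment = [0] * n_devices          # every device starts in coalition 0
    sizes = [0] * len(capacities)
    sizes[0] = n_devices
    for _ in range(max_rounds):
        moved = False
        for d in range(n_devices):
            cur = assignment[d]
            best, best_u = cur, utility(capacities[cur], sizes[cur])
            for k, cap in enumerate(capacities):
                if k == cur:
                    continue
                u = utility(cap, sizes[k] + 1)   # utility after joining k
                if u > best_u:                   # strict improvement only
                    best, best_u = k, u
            if best != cur:
                sizes[cur] -= 1
                sizes[best] += 1
                assignment[d] = best
                moved = True
        if not moved:
            return assignment, sizes
    raise RuntimeError("no stable partition found within max_rounds")


def is_nash_stable(capacities, assignment, sizes):
    """Check that no device gains strictly by a unilateral switch."""
    for cur in assignment:
        cur_u = utility(capacities[cur], sizes[cur])
        for k, cap in enumerate(capacities):
            if k != cur and utility(cap, sizes[k] + 1) > cur_u:
                return False
    return True


if __name__ == "__main__":
    caps = [6.0, 3.0]
    assignment, sizes = form_coalitions(caps, 3)
    print(assignment, sizes, is_nash_stable(caps, assignment, sizes))
```

With this toy utility the switching process behaves like a singleton congestion game, so strict-improvement moves terminate. A learned switch rule, as in the article, would replace the exhaustive search over coalitions in the inner loop.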
