A stochastic game framework for joint frequency and power allocation in dynamic decentralized cognitive radio networks

Abstract Cognitive radio networks (CRNs) have been recognized as a promising solution for improving radio spectrum utilization. This article investigates the novel problem of joint frequency and power allocation in decentralized CRNs with dynamic, time-varying spectrum resources. We first model the interactions among decentralized cognitive radio links as a stochastic game and then propose a strategy learning algorithm that effectively integrates multi-agent frequency strategy learning with power pricing. The convergence of the proposed algorithm to a Nash equilibrium is proved theoretically. Simulation results demonstrate that the throughput of the proposed algorithm is very close to that of the centralized optimal learning algorithm, while the proposed algorithm can be implemented distributively and significantly reduces information exchange.
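To make the abstract's idea concrete, the following is a minimal sketch, not the paper's actual algorithm, of how decentralized links might learn joint (channel, power) strategies in a stochastic game: each link runs independent Q-learning over the observed channel-availability state, with a linear power-pricing term subtracted from its throughput reward. All names and parameters here (N_LINKS, PRICE, the i.i.d. primary-user activity model, the power levels) are illustrative assumptions.

```python
# Illustrative sketch only: independent multi-agent Q-learning over joint
# (channel, power-level) actions, with a power price in the reward.
import itertools
import math
import random

N_LINKS = 3                       # decentralized cognitive radio links (assumed)
N_CHANNELS = 4                    # licensed channels with time-varying availability
POWER_LEVELS = [0.5, 1.0, 2.0]    # transmit power choices in watts (assumed)
NOISE = 1e-3                      # background noise power (assumed)
PRICE = 0.5                       # pricing coefficient on transmit power (assumed)
ACTIONS = list(itertools.product(range(N_CHANNELS), POWER_LEVELS))

def reward(link, action, actions_of_all, channel_busy):
    """Shannon-rate reward minus a power price; interference from co-channel links."""
    ch, p = action
    if channel_busy[ch]:          # channel occupied by a primary user: no reward
        return 0.0
    interference = sum(q for j, (c, q) in enumerate(actions_of_all)
                       if j != link and c == ch)
    sinr = p / (NOISE + interference)
    return math.log2(1.0 + sinr) - PRICE * p

# One Q-table per link, keyed by the observed channel-availability state.
Q = [dict() for _ in range(N_LINKS)]
alpha, gamma, eps = 0.1, 0.9, 0.1

def choose(link, state):
    """Epsilon-greedy action selection for one link in the given state."""
    q = Q[link].setdefault(state, [0.0] * len(ACTIONS))
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=q.__getitem__)

state = tuple(random.random() < 0.3 for _ in range(N_CHANNELS))
for t in range(5000):
    idx = [choose(i, state) for i in range(N_LINKS)]
    acts = [ACTIONS[k] for k in idx]
    # i.i.d. primary-user activity model (assumed), standing in for the paper's
    # dynamic spectrum environment.
    next_state = tuple(random.random() < 0.3 for _ in range(N_CHANNELS))
    for i in range(N_LINKS):
        r = reward(i, acts[i], acts, state)
        q_next = max(Q[i].setdefault(next_state, [0.0] * len(ACTIONS)))
        Q[i][state][idx[i]] += alpha * (r + gamma * q_next - Q[i][state][idx[i]])
    state = next_state
```

The pricing term is what discourages each link from always transmitting at maximum power, which is the role the abstract assigns to power pricing; the paper's integrated frequency-strategy-learning step and its convergence argument are not reproduced here.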
