Multi-Agent Reinforcement Learning-Based Distributed Dynamic Spectrum Access

Dynamic spectrum access (DSA) is an effective solution for efficiently utilizing the radio spectrum by sharing it among multiple networks. A DSA controller has two primary tasks: 1) maximizing the quality of service of users in the licensee’s network and 2) avoiding interference to communications in the incumbent network. These two tasks become particularly challenging in a distributed DSA network, where no centralized controller regulates the sharing of the radio spectrum between incumbents and licensees. Hence, optimization-driven techniques for designing power allocation schemes in such a network often become intractable. Accordingly, in this paper, we present a distributed DSA communication framework based on multi-agent reinforcement learning (RL), where the cells of the multi-user multiple-input multiple-output (MU-MIMO) licensee network act as agents and the average signal-to-noise ratio serves as the reward. In particular, taking the physical-layer parameters of the DSA network into account, we analyze several RL algorithms, namely Q-learning, deep Q-network (DQN), deep deterministic policy gradient (DDPG), and twin delayed deep deterministic policy gradient (TD3), through which the licensee network learns optimal power allocation policies for accessing the spectrum in a distributed fashion, without the need for a central DSA controller to manage the interference towards the incumbent. Trade-offs among the considered algorithms are identified with respect to performance, time complexity, and scalability of the DSA network.
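
To make the multi-agent setup concrete, the following is a minimal sketch of independent (tabular) Q-learning for distributed power allocation, the simplest of the algorithms listed above. The number of cells, the discrete power levels, the toy channel and reward model, and the incumbent interference threshold are all illustrative assumptions for this sketch, not values or models taken from the paper.

```python
# Minimal sketch: independent Q-learning agents (one per licensee cell)
# choosing discrete transmit-power levels, rewarded by a toy average-SNR
# metric with a penalty when interference at the incumbent exceeds a
# threshold. All constants and the channel model below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_CELLS = 4                                     # licensee cells acting as agents
POWER_LEVELS = np.array([0.1, 0.5, 1.0, 2.0])   # assumed discrete powers (W)
N_ACTIONS = len(POWER_LEVELS)
NOISE = 1e-3                                    # assumed receiver noise power
INCUMBENT_LIMIT = 1.5                           # assumed interference threshold
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1               # learning rate, discount, exploration

# One Q-table per agent; a single dummy state keeps the example tabular.
Q = np.zeros((N_CELLS, N_ACTIONS))

def reward(powers):
    """Toy reward: average SNR across cells, penalized if the aggregate
    interference received at the incumbent exceeds a threshold."""
    gains = rng.uniform(0.5, 1.0, size=N_CELLS)   # random per-cell channel gains
    interference = powers.sum() - powers          # interference from other cells
    snr = gains * powers / (NOISE + 0.1 * interference)
    penalty = 10.0 if (0.2 * powers).sum() > INCUMBENT_LIMIT else 0.0
    return snr.mean() - penalty

for episode in range(5000):
    # Each agent picks a power level epsilon-greedily from its own Q-table,
    # with no central controller coordinating the choices.
    actions = np.where(rng.random(N_CELLS) < EPS,
                       rng.integers(N_ACTIONS, size=N_CELLS),
                       Q.argmax(axis=1))
    r = reward(POWER_LEVELS[actions])
    # Independent Q-learning update driven by the shared average-SNR reward.
    for i, a in enumerate(actions):
        Q[i, a] += ALPHA * (r + GAMMA * Q[i].max() - Q[i, a])

print("Learned power levels per cell:", POWER_LEVELS[Q.argmax(axis=1)])
```

The deep variants (DQN, DDPG, TD3) would replace each per-cell Q-table with a neural network, with DDPG and TD3 additionally allowing continuous rather than discretized power levels; the overall agent/reward structure stays the same.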