Deep Reinforcement Learning for Joint Channel Selection and Power Control in D2D Networks

Device-to-device (D2D) technology, which allows direct communication between proximal devices, is widely acknowledged as a promising candidate to alleviate the mobile traffic explosion problem. In this paper, we consider an overlay D2D network in which multiple D2D pairs coexist on several orthogonal spectrum bands, i.e., channels. Due to spectrum scarcity, the number of D2D pairs typically exceeds the number of available channels, and thus multiple D2D pairs may use a single channel simultaneously. This may lead to severe co-channel interference and degrade network performance. To deal with this issue, we formulate a joint channel selection and power control optimization problem with the aim of maximizing the weighted sum-rate (WSR) of the D2D network. Unfortunately, this problem is non-convex and NP-hard. To solve it, we first adopt the state-of-the-art fractional programming (FP) technique and develop an FP-based algorithm to obtain a near-optimal solution. However, the FP-based algorithm requires instantaneous global channel state information (CSI) for centralized processing, resulting in poor scalability and prohibitively high signalling overhead. Therefore, we further propose a distributed deep reinforcement learning (DRL)-based scheme, with which D2D pairs can autonomously optimize channel selection and transmit power by exploiting only local information and outdated nonlocal information. Compared with the FP-based algorithm, the DRL-based scheme achieves better scalability and significantly reduces signalling overhead. Simulation results demonstrate that, even without instantaneous global CSI, the performance of the DRL-based scheme closely approaches that of the FP-based algorithm.
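To make the objective concrete, the following minimal sketch evaluates a weighted sum-rate for one given channel selection and power allocation. It is an illustration only, not the paper's formulation: the Shannon-rate expression, the gain layout `gains[c][i][j]` (channel c, receiver of pair i, transmitter of pair j), the noise power, and the function name `weighted_sum_rate` are all assumptions introduced here.

```python
import math


def weighted_sum_rate(gains, channels, powers, weights, noise=1.0):
    """Weighted sum-rate (bits/s/Hz) for a fixed allocation.

    Assumed, hypothetical layout:
      gains[c][i][j] -- power gain on channel c from transmitter j to receiver i
      channels[i]    -- index of the channel selected by D2D pair i
      powers[i]      -- transmit power of D2D pair i
      weights[i]     -- rate weight of D2D pair i
    """
    n = len(channels)
    total = 0.0
    for i in range(n):
        c = channels[i]
        signal = gains[c][i][i] * powers[i]
        # Only pairs that selected the same channel interfere with pair i.
        interference = sum(
            gains[c][i][j] * powers[j]
            for j in range(n)
            if j != i and channels[j] == c
        )
        sinr = signal / (interference + noise)
        total += weights[i] * math.log2(1.0 + sinr)
    return total


# Two pairs, two channels, unit gains everywhere (toy numbers).
unit_gains = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
shared = weighted_sum_rate(unit_gains, [0, 0], [1.0, 1.0], [1.0, 1.0])
separate = weighted_sum_rate(unit_gains, [0, 1], [1.0, 1.0], [1.0, 1.0])
```

In this toy setting, placing both pairs on the same channel lowers the objective relative to giving each its own channel, which is the co-channel interference effect the abstract describes. The optimization problem itself searches over `channels` and `powers` jointly.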
