Paper information - Deep Reinforcement Learning-aided Transmission Design for Multi-user V2V Networks

Intelligent connected vehicles (ICVs) are widely regarded as key to reducing road accident rates and improving traffic efficiency. However, ensuring high communication reliability and low transmission delay in vehicular networks is challenging, especially in large-scale dynamic networks with diverse, heterogeneous data exchange demands. In this paper, we investigate the potential of applying deep reinforcement learning (DRL) to facilitate efficient transmission design in a class of complex multi-user vehicle-to-vehicle (V2V) networks, where conventional mathematical tools have difficulty solving the design optimization problems. The considered network contains several pairs of V2V links sharing the channel resource. Each link needs to communicate two types of delay-sensitive messages to support different safety-related applications while maximizing energy efficiency. We propose transforming the power/rate control problem into a Markov decision process and then solving it using the deep deterministic policy gradient (DDPG) algorithm. Simulation results show that, in a two-user network, our DRL-aided solution outperforms a solution based on Lyapunov optimization. Extending the DRL-aided solution to a larger network is straightforward, whereas extending the Lyapunov-based one is not. These results demonstrate the advantages of applying DRL to support wireless system design.
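To make the idea of casting power control as a Markov decision process more concrete, the following is a minimal sketch of a toy environment. It is not the paper's model: the abstract does not specify the state, action, or reward, so every name and parameter here (`TwoLinkV2VEnv`, `p_circuit`, `backlog_weight`, the fading statistics, and the use of a single queue per link instead of two message types) is an illustrative assumption. The action is a continuous transmit power per link, which is the kind of action space a DDPG agent is designed to handle.

```python
import math
import random


class TwoLinkV2VEnv:
    """Toy MDP: two V2V links share one channel (all parameters are assumptions)."""

    def __init__(self, p_max=1.0, noise=0.1, arrival=1.0,
                 p_circuit=0.1, backlog_weight=0.1, seed=0):
        self.p_max = p_max                    # maximum transmit power per link
        self.noise = noise                    # receiver noise power
        self.arrival = arrival                # data arriving per step (bits/Hz)
        self.p_circuit = p_circuit            # fixed power, keeps the ratio finite
        self.backlog_weight = backlog_weight  # weight of the delay proxy
        self.rng = random.Random(seed)
        self.queues = [0.0, 0.0]              # one backlog queue per link
        self.gains = self._draw_gains()

    def _draw_gains(self):
        # gains[i][j]: power gain from transmitter j to receiver i.
        # Direct links have mean 1.0, cross links mean 0.2 (assumed values).
        return [[self.rng.expovariate(1.0 if i == j else 5.0) for j in range(2)]
                for i in range(2)]

    def state(self):
        # State = queue backlogs followed by the four channel gains.
        return self.queues + [g for row in self.gains for g in row]

    def step(self, powers):
        # Action = one continuous transmit power per link, clipped to [0, p_max].
        p = [min(max(x, 0.0), self.p_max) for x in powers]
        rates = []
        for i in range(2):
            interference = self.gains[i][1 - i] * p[1 - i]
            sinr = self.gains[i][i] * p[i] / (self.noise + interference)
            rates.append(math.log2(1.0 + sinr))
        # Queues grow with arrivals and shrink with the achieved rate.
        self.queues = [max(q + self.arrival - r, 0.0)
                       for q, r in zip(self.queues, rates)]
        # Reward = energy efficiency minus a backlog penalty (delay proxy).
        efficiency = sum(rates) / (sum(p) + self.p_circuit)
        reward = efficiency - self.backlog_weight * sum(self.queues)
        self.gains = self._draw_gains()       # channel changes every step
        return self.state(), reward


env = TwoLinkV2VEnv(seed=0)
next_state, reward = env.step([0.5, 0.5])
```

In a full DRL pipeline, an actor network would map `state()` to the two powers and a critic would estimate the long-term reward; those parts are omitted here to keep the sketch focused on the state, action, and reward structure.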
