Multi-Agent Reinforcement Learning for Joint Channel Assignment and Power Allocation in Platoon-Based C-V2X Systems

We consider the problem of joint channel assignment and power allocation in underlaid cellular vehicle-to-everything (C-V2X) systems, where multiple vehicle-to-infrastructure (V2I) uplinks share time-frequency resources with multiple vehicle-to-vehicle (V2V) platoons that enable groups of connected and autonomous vehicles to travel closely together. Because channels vary rapidly in vehicular environments, traditional centralized optimization approaches that rely on global channel information may not be viable in C-V2X systems with a large number of users. To overcome this challenge, we propose a distributed resource allocation (RA) algorithm based on reinforcement learning (RL). Specifically, we model the RA problem as a multi-agent system. Using only local channel information, each platoon leader acts as an agent that interacts with the other agents and selects the optimal combination of sub-band and power level for its transmissions. Toward this end, we employ the double deep Q-learning algorithm to jointly train the agents with the objectives of maximizing the V2I sum-rate while satisfying the packet delivery probability of each V2V link within a desired latency constraint. Simulation results show that our proposed RL-based algorithm achieves performance close to that of the well-known exhaustive search algorithm.
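The abstract's core mechanism is double Q-learning over a discrete joint action space of (sub-band, power level) pairs. The minimal sketch below illustrates that mechanism with a tabular double Q-learning agent for a single platoon leader; the paper itself uses a double *deep* Q-network and a reward built from the V2I sum-rate and V2V delivery constraint, so the table, the action-space sizes, and the toy reward here are all illustrative assumptions, not the authors' implementation.

```python
import random

# Hypothetical sketch: tabular double Q-learning for one platoon-leader
# agent choosing a joint action (sub-band, power level). A Q-table stands
# in for the paper's deep Q-network to keep the example self-contained.

N_SUBBANDS = 4       # assumed number of shared V2I sub-bands
N_POWER_LEVELS = 3   # assumed number of discrete transmit power levels
ACTIONS = [(b, p) for b in range(N_SUBBANDS) for p in range(N_POWER_LEVELS)]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative hyperparameters

def make_q_tables(n_states):
    # Double Q-learning maintains two independent estimates, Q_A and Q_B.
    return ([[0.0] * len(ACTIONS) for _ in range(n_states)],
            [[0.0] * len(ACTIONS) for _ in range(n_states)])

def select_action(q_a, q_b, state, rng):
    # Epsilon-greedy over the sum of the two estimates.
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    combined = [a + b for a, b in zip(q_a[state], q_b[state])]
    return combined.index(max(combined))

def double_q_update(q_a, q_b, state, action, reward, next_state, rng):
    # With probability 1/2, update Q_A using Q_B to evaluate the action
    # that Q_A ranks best (and vice versa). Decoupling action selection
    # from evaluation removes plain Q-learning's maximization bias.
    if rng.random() < 0.5:
        best = q_a[next_state].index(max(q_a[next_state]))
        target = reward + GAMMA * q_b[next_state][best]
        q_a[state][action] += ALPHA * (target - q_a[state][action])
    else:
        best = q_b[next_state].index(max(q_b[next_state]))
        target = reward + GAMMA * q_a[next_state][best]
        q_b[state][action] += ALPHA * (target - q_b[state][action])
```

In the multi-agent setting described in the abstract, each platoon leader would run its own copy of this learner on its local channel observations, with the reward shaped by the V2I sum-rate and the V2V delivery-probability constraint rather than the toy signal used here.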
