UAV-Enabled Secure Communications by Multi-Agent Deep Reinforcement Learning

Unmanned aerial vehicles (UAVs) can be employed as aerial base stations to support communication for ground users (GUs). However, because of the high flying altitude, the aerial-to-ground (A2G) channel is dominated by line-of-sight (LoS) links, which are easily wiretapped by ground eavesdroppers (GEs). In this case, a single UAV has limited maneuvering capacity to achieve the desired secure rate in the presence of multiple eavesdroppers. In this paper, we propose a cooperative jamming approach in which UAV jammers help the UAV transmitter defend against GEs. Specifically, the UAV transmitter sends confidential information to the GUs, while the UAV jammers send artificial noise signals to the GEs using 3D beamforming. We propose a multi-agent deep reinforcement learning (MADRL) approach, namely multi-agent deep deterministic policy gradient (MADDPG), to maximize the secure capacity by jointly optimizing the trajectories of the UAVs, the transmit power of the UAV transmitter, and the jamming power of the UAV jammers. The MADDPG algorithm adopts centralized training and distributed execution. Simulation results show that the MADRL method can realize the joint trajectory design of the UAVs and achieve good performance. To improve learning efficiency and convergence, we further propose a continuous action attention MADDPG (CAA-MADDPG) method, in which each agent learns to pay attention to the actions and observations of the other agents that are most relevant to it. Simulation results show that CAA-MADDPG achieves higher rewards than MADDPG without attention.
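To make the optimization objective concrete, the following minimal sketch computes a secrecy rate for one UAV transmitter, a set of UAV jammers, one GU, and several GEs. It is an illustration under stated assumptions, not the system model of this paper: it uses the textbook definition (legitimate rate minus the best eavesdropper rate, floored at zero), a free-space LoS gain `beta0 / d^2`, and an idealized beamforming assumption that the artificial noise reaches the GEs but not the GU. The function names, `beta0`, and `noise` are hypothetical.

```python
import math


def los_gain(tx, rx, beta0=1e-3):
    """Free-space LoS power gain beta0 / d^2 (beta0 is an assumed reference gain at 1 m)."""
    d2 = sum((a - b) ** 2 for a, b in zip(tx, rx))
    return beta0 / d2


def secrecy_rate(tx_pos, jammer_pos, gu_pos, ge_positions, p_tx, p_jam, noise=1e-9):
    """Secrecy rate in bit/s/Hz under the simplifying assumptions stated above.

    tx_pos, gu_pos: (x, y, z) of the UAV transmitter and the ground user.
    jammer_pos, p_jam: positions and jamming powers of the UAV jammers.
    ge_positions: positions of the ground eavesdroppers.
    """
    # Rate of the legitimate link; jamming is assumed not to reach the GU.
    rate_gu = math.log2(1 + p_tx * los_gain(tx_pos, gu_pos) / noise)

    # Rate of the strongest eavesdropper; jamming adds to its noise floor.
    worst_ge_rate = 0.0
    for ge in ge_positions:
        interference = sum(pj * los_gain(j, ge) for j, pj in zip(jammer_pos, p_jam))
        sinr = p_tx * los_gain(tx_pos, ge) / (noise + interference)
        worst_ge_rate = max(worst_ge_rate, math.log2(1 + sinr))

    return max(0.0, rate_gu - worst_ge_rate)


# Example geometry: transmitter above the origin, one jammer above the eavesdropper.
r_no_jam = secrecy_rate((0, 0, 100), [(50, 0, 100)], (10, 0, 0), [(50, 0, 0)], 1.0, [0.0])
r_jam = secrecy_rate((0, 0, 100), [(50, 0, 100)], (10, 0, 0), [(50, 0, 0)], 1.0, [1.0])
```

In this toy geometry, switching the jammer on raises the secrecy rate because it degrades only the eavesdropper's link. In a reinforcement learning setting, a quantity of this kind could serve as the per-step reward that the UAV positions and powers are chosen to increase.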
