Paper information - Correlation-aware Cooperative Multigroup Broadcast 360° Video Delivery Network: A Hierarchical Deep Reinforcement Learning Approach

Correlation-aware Cooperative Multigroup Broadcast 360° Video Delivery Network: A Hierarchical Deep Reinforcement Learning Approach

Sports events require video from unmanned aerial vehicles (UAVs) to be received anywhere in the stadium and demand very high per-cell throughput for video transmission to virtual reality (VR) users. A promising solution is a cell-free multi-group broadcast (CF-MB) network in which access points (APs) cooperate in both reception and broadcast. To exploit the benefit of broadcasting user-correlated, decode-dependent video resources to spatially correlated VR users, the network should dynamically schedule the video and cluster APs into virtual cells, each serving a different group of VR users with overlapping video requests. We decompose the problem into scheduling and association sub-problems. We first introduce conventional non-learning-based scheduling and association algorithms, together with a centralized deep reinforcement learning (DRL) association approach based on the Rainbow agent, which uses a convolutional neural network (CNN) to generate decisions from observations. To reduce its complexity, we then decompose the association problem into multiple sub-problems, resulting in a networked distributed partially observable Markov decision process (ND-POMDP), which we solve with a multi-agent DRL algorithm. To jointly solve the coupled association and scheduling problems, we further develop a hierarchical federated DRL algorithm with the scheduler as the meta-controller and association as the controller. Our simulation results show that the CF-MB network can effectively handle real-time video transmission from UAVs to VR users. The proposed learning architectures are effective and scalable for a high-dimensional cooperative association problem with increasing numbers of APs and VR users. In addition, the proposed algorithms significantly outperform non-learning-based methods.
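The meta-controller/controller split described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it replaces the deep, federated agents with small tabular, epsilon-greedy value learners, treats each time slot as independent, and feeds both levels the same slot reward. The payoff table `GAIN`, the sizes `N_GROUPS`, `N_APS`, `N_STATES`, and all function and class names are hypothetical and chosen only for illustration. It shows only the decision order: the scheduler first picks which user group to serve, then the association level picks an AP conditioned on that choice.

```python
import random
from collections import defaultdict

N_GROUPS, N_APS, N_STATES = 2, 3, 2
# Hypothetical toy payoff GAIN[state][group][ap]; not taken from the paper.
GAIN = [
    [[1.0, 1.0, 2.0], [1.0, 8.0, 1.0]],  # state 0: best is group 1 via AP 1
    [[9.0, 1.0, 1.0], [2.0, 1.0, 1.0]],  # state 1: best is group 0 via AP 0
]


class TabularAgent:
    """Epsilon-greedy tabular value learner (one-step, no bootstrapping).

    A simple stand-in for a DRL agent; an assumption of this sketch.
    """

    def __init__(self, n_actions, alpha, epsilon, rng):
        self.n_actions = n_actions
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = rng
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def greedy(self, state):
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def act(self, state):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(state)

    def update(self, state, action, reward):
        # Move the stored value toward the observed slot reward.
        self.q[state][action] += self.alpha * (reward - self.q[state][action])


def train(n_slots=20000, seed=0):
    rng = random.Random(seed)
    scheduler = TabularAgent(N_GROUPS, alpha=0.05, epsilon=0.2, rng=rng)  # meta-controller
    associator = TabularAgent(N_APS, alpha=0.5, epsilon=0.2, rng=rng)     # controller
    for _ in range(n_slots):
        state = rng.randrange(N_STATES)       # toy network state for this slot
        group = scheduler.act(state)          # upper level: which group to serve
        ap = associator.act((state, group))   # lower level: AP choice given the group
        reward = GAIN[state][group][ap]
        associator.update((state, group), ap, reward)
        scheduler.update(state, group, reward)
    return scheduler, associator


def joint_greedy(scheduler, associator, state):
    """Return the learned (group, ap) decision without exploration."""
    group = scheduler.greedy(state)
    return group, associator.greedy((state, group))
```

Calling `train()` and then `joint_greedy(scheduler, associator, state)` for each toy state returns the learned two-level decision. The key design point is that the lower-level agent's state includes the upper-level choice, so the two sub-problems stay coupled while each agent faces a smaller action space than a single joint agent would.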
