Age of Information Aware Radio Resource Management in Vehicular Networks: A Proactive Deep Reinforcement Learning Perspective

In this paper, we investigate the problem of age of information (AoI)-aware radio resource management for expected long-term performance optimization in a Manhattan grid vehicle-to-vehicle network. Observing the global network state at each scheduling slot, the roadside unit (RSU) allocates frequency bands and schedules packet transmissions for all vehicle user equipment pairs (VUE-pairs). We model this stochastic decision-making process as a discrete-time single-agent Markov decision process (MDP). The technical challenges in deriving the optimal control policy stem from the high spatial mobility of the VUE-pairs and their temporally varying traffic arrivals. To make the problem tractable, we first decompose the original MDP into a series of per-VUE-pair MDPs. We then propose a proactive algorithm based on long short-term memory (LSTM) and deep reinforcement learning to address the partial observability and the curse of dimensionality in the local network state space faced by each VUE-pair. With the proposed algorithm, the RSU makes the optimal frequency band allocation and packet scheduling decisions at each scheduling slot in a decentralized manner, based on the VUE-pairs' partial observations of the global network state. Numerical experiments validate the theoretical analysis and demonstrate significant performance gains from the proposed algorithm.
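To make the LSTM-plus-deep-reinforcement-learning idea concrete, below is a minimal PyTorch sketch of a recurrent Q-network of the kind the abstract describes: an LSTM summarizes a VUE-pair's history of partial observations, and a linear head maps the latest hidden state to Q-values over frequency-band-allocation and packet-scheduling actions. The class name, observation size, action count, and hidden dimension (`LSTMQNetwork`, `OBS_DIM`, `NUM_ACTIONS`, `HIDDEN_DIM`) are illustrative assumptions, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

OBS_DIM = 8      # assumed size of a VUE-pair's local observation (e.g., AoI, queue, channel state)
NUM_ACTIONS = 4  # assumed number of joint frequency-band / scheduling choices
HIDDEN_DIM = 32  # assumed LSTM hidden size


class LSTMQNetwork(nn.Module):
    """Recurrent Q-network sketch: the LSTM summarizes the observation history
    so the agent can act under partial observability of the global network state."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(OBS_DIM, HIDDEN_DIM, batch_first=True)
        self.head = nn.Linear(HIDDEN_DIM, NUM_ACTIONS)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, OBS_DIM) -- the history of partial observations
        out, hidden = self.lstm(obs_seq, hidden)
        # Q-values are read from the hidden state at the latest scheduling slot
        q_values = self.head(out[:, -1, :])
        return q_values, hidden


if __name__ == "__main__":
    # Hypothetical usage: at each scheduling slot, pick the action with the
    # highest predicted long-term value given the recent observation history.
    net = LSTMQNetwork()
    history = torch.randn(1, 10, OBS_DIM)  # 10 most recent partial observations
    q, _ = net(history)
    action = q.argmax(dim=1).item()        # greedy band-allocation/scheduling decision
    print("chosen action:", action)
```

In a training loop, such a network would typically be updated with a standard deep Q-learning target over stored observation sequences; the per-VUE-pair decomposition means one such learner (or one shared network evaluated per pair) suffices for each VUE-pair rather than a monolithic network over the full global state.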
