Age of Information Aware Radio Resource Management in Vehicular Networks: A Proactive Deep Reinforcement Learning Perspective

In this paper, we investigate the problem of age of information (AoI)-aware radio resource management for expected long-term performance optimization in a Manhattan grid vehicle-to-vehicle network. Observing the global network state at each scheduling slot, the roadside unit (RSU) allocates frequency bands and schedules packet transmissions for all vehicle user equipment pairs (VUE-pairs). We model this stochastic decision-making process as a discrete-time single-agent Markov decision process (MDP). The technical challenges in deriving the optimal control policy stem from the high spatial mobility of the VUE-pairs and their temporally varying traffic arrivals. To make the problem tractable, we first decompose the original MDP into a series of per-VUE-pair MDPs. We then propose a proactive algorithm based on long short-term memory (LSTM) and deep reinforcement learning to address the partial observability and the curse of dimensionality in the local network state space faced by each VUE-pair. With the proposed algorithm, the RSU makes the optimal frequency band allocation and packet scheduling decisions at each scheduling slot in a decentralized manner, based on the VUE-pairs' partial observations of the global network state. Numerical experiments validate the theoretical analysis and demonstrate significant performance gains from the proposed algorithm.
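To make the LSTM-plus-deep-reinforcement-learning idea concrete, below is a minimal PyTorch sketch of a recurrent Q-network of the kind the abstract describes: an LSTM summarizes a VUE-pair's history of partial observations, and a linear head maps the latest hidden state to Q-values over frequency-band-allocation and packet-scheduling actions. The class name, observation size, action count, and hidden dimension (`LSTMQNetwork`, `OBS_DIM`, `NUM_ACTIONS`, `HIDDEN_DIM`) are illustrative assumptions, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

OBS_DIM = 8      # assumed size of a VUE-pair's local observation (e.g., AoI, queue, channel state)
NUM_ACTIONS = 4  # assumed number of joint frequency-band / scheduling choices
HIDDEN_DIM = 32  # assumed LSTM hidden size


class LSTMQNetwork(nn.Module):
    """Recurrent Q-network sketch: the LSTM summarizes the observation history
    so the agent can act under partial observability of the global network state."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(OBS_DIM, HIDDEN_DIM, batch_first=True)
        self.head = nn.Linear(HIDDEN_DIM, NUM_ACTIONS)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, OBS_DIM) -- the history of partial observations
        out, hidden = self.lstm(obs_seq, hidden)
        # Q-values are read from the hidden state at the latest scheduling slot
        q_values = self.head(out[:, -1, :])
        return q_values, hidden


if __name__ == "__main__":
    # Hypothetical usage: at each scheduling slot, pick the action with the
    # highest predicted long-term value given the recent observation history.
    net = LSTMQNetwork()
    history = torch.randn(1, 10, OBS_DIM)  # 10 most recent partial observations
    q, _ = net(history)
    action = q.argmax(dim=1).item()        # greedy band-allocation/scheduling decision
    print("chosen action:", action)
```

In a training loop, such a network would typically be updated with a standard deep Q-learning target over stored observation sequences; the per-VUE-pair decomposition means one such learner (or one shared network evaluated per pair) suffices for each VUE-pair rather than a monolithic network over the full global state.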
