Computation Offloading in Heterogeneous Vehicular Edge Networks: On-line and Off-policy Bandit Solutions

With the rapid advancement of vehicular communications and intelligent transportation systems, task offloading in vehicular networks is emerging as a promising, yet challenging, paradigm in mobile edge computing. In this paper, we study the computation offloading problem for mobile vehicles/users, more specifically the network and base station selection problem, in a heterogeneous Vehicular Edge Computing (VEC) scenario where networks have different traffic loads. In a fast-varying vehicular environment, the latency of computation offloading caused by network congestion (e.g., at the edge computing servers co-located with the base stations) is a key performance metric. However, because such environments are non-stationary, predicting network congestion is difficult. To address this challenge, we propose an on-line algorithm and an off-policy learning algorithm based on bandit theory. These algorithms learn, from the offloading history, the latency that offloaded tasks experience, and use it to dynamically select the least congested network in a piecewise-stationary environment. In addition, to minimize task loss due to vehicle mobility, we develop a method for base station selection and a relaying mechanism in the chosen network, both based on the sojourn time of the vehicles. Through extensive numerical analysis, we demonstrate that the proposed learning-based solutions adapt to changes in network traffic by selecting the least congested network. Moreover, the proposed approaches reduce the latency of offloaded tasks.
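To make the idea of bandit-based network selection concrete, the following is a minimal sketch of a sliding-window confidence-bound policy that picks the network with the lowest recent latency. It is not the algorithm proposed in the paper: the class name `SlidingWindowUCB`, the window length, the exploration constant, and the simulated latency means are all illustrative assumptions, and a sliding window is only one common way to handle piecewise-stationary environments.

```python
import math
import random
from collections import deque


class SlidingWindowUCB:
    """Hypothetical policy: choose the network whose recent latency looks lowest."""

    def __init__(self, n_networks, window=200, explore=0.5):
        self.n = n_networks
        self.explore = explore  # assumed exploration constant
        # Only the most recent `window` (network, latency) pairs are kept,
        # so old observations are forgotten after a traffic change.
        self.history = deque(maxlen=window)

    def select(self):
        counts = [0] * self.n
        sums = [0.0] * self.n
        for net, latency in self.history:
            counts[net] += 1
            sums[net] += latency
        # Any network with no sample inside the window is tried first.
        for net in range(self.n):
            if counts[net] == 0:
                return net
        t = len(self.history)

        def lower_bound(net):
            # Optimistic (lower) bound on latency, since lower is better.
            mean = sums[net] / counts[net]
            bonus = self.explore * math.sqrt(math.log(t) / counts[net])
            return mean - bonus

        return min(range(self.n), key=lower_bound)

    def update(self, net, latency):
        self.history.append((net, latency))


def simulate(steps=4000, change_at=2000, seed=0):
    """Three networks whose mean latencies swap at `change_at` (assumed values)."""
    rng = random.Random(seed)
    means_before = [0.2, 0.6, 0.8]
    means_after = [0.8, 0.6, 0.2]
    agent = SlidingWindowUCB(n_networks=3)
    picks = []
    for t in range(steps):
        means = means_before if t < change_at else means_after
        net = agent.select()
        # Normalized latency in [0, 1] with Gaussian noise.
        latency = min(1.0, max(0.0, rng.gauss(means[net], 0.1)))
        agent.update(net, latency)
        picks.append(net)
    return picks


if __name__ == "__main__":
    picks = simulate()
    print("share of network 0 before change:", picks[1500:2000].count(0) / 500)
    print("share of network 2 after change:", picks[3500:].count(2) / 500)
```

In this toy setting the policy should concentrate on network 0 before the change point and shift to network 2 afterwards, because stale observations fall out of the window. Handling mobility (base station selection and relaying by sojourn time) is outside the scope of this sketch.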
