Efficient Learning-based Scheduling for Information Freshness in Wireless Networks

Motivated by the recent trend of integrating artificial intelligence into the Internet-of-Things (IoT), we consider the problem of scheduling packets from multiple sensing sources to a central controller over a wireless network. Packets from different sensing sources have different values, or degrees of importance, to the central controller for intelligent decision making. In such a setup, it is critical to provide the central controller with information that is both timely and valuable. In this paper, we develop a parameterized maximum-weight type scheduling policy whose weight measure combines Age-of-Information (AoI) metrics and Upper Confidence Bound (UCB) estimates through a parameter η. The UCB estimates balance the tradeoff between exploration and exploitation in learning and are critical for yielding a small cumulative regret. We show that the proposed algorithm achieves a running average total age of at most O(Nη). We also prove that its cumulative regret over a time horizon T is at most O(NT/η + √(NT log T)). This reveals a tradeoff between the cumulative regret and the running average total age: increasing η reduces the cumulative regret, but at the cost of a larger running average total age. Simulation results are provided to evaluate the efficiency of the proposed algorithm.
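
The abstract does not state the exact weight measure, so the following is only a minimal sketch of how a max-weight rule mixing age and UCB estimates through η might look. The additive weight `age + eta * ucb`, the UCB1-style bonus, the single reliable channel, the Bernoulli packet values, and all function and variable names are assumptions made for illustration, not the authors' algorithm.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """UCB1-style index (assumed form): empirical mean plus exploration bonus."""
    if pulls == 0:
        # Finite optimistic value so that eta = 0 does not produce 0 * inf.
        return 1.0 + math.sqrt(2.0 * math.log(t + 1))
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def simulate(true_values, eta, horizon, seed=0):
    """Run the assumed policy; return (average total age, average value)."""
    rng = random.Random(seed)
    n = len(true_values)
    ages = [1] * n      # age of the freshest packet from each source
    means = [0.0] * n   # empirical mean value of each source
    pulls = [0] * n     # number of times each source was scheduled
    total_age = 0
    total_value = 0.0

    for t in range(1, horizon + 1):
        # Assumed max-weight measure: age plus eta times the UCB estimate.
        weights = [ages[i] + eta * ucb_index(means[i], pulls[i], t)
                   for i in range(n)]
        chosen = max(range(n), key=weights.__getitem__)

        # Assumption: transmission always succeeds and the packet's value
        # is a Bernoulli draw with an unknown mean.
        reward = 1.0 if rng.random() < true_values[chosen] else 0.0
        pulls[chosen] += 1
        means[chosen] += (reward - means[chosen]) / pulls[chosen]
        total_value += reward

        # The scheduled source's age resets; every other age grows by one.
        for i in range(n):
            ages[i] = 1 if i == chosen else ages[i] + 1
        total_age += sum(ages)

    return total_age / horizon, total_value / horizon


if __name__ == "__main__":
    for eta in (0.0, 5.0, 50.0):
        avg_age, avg_value = simulate([0.9, 0.5, 0.1], eta, horizon=5000)
        print(f"eta={eta:5.1f}  avg total age={avg_age:7.2f}  "
              f"avg value={avg_value:.3f}")
```

In this toy setting, η = 0 reduces to a purely age-based round-robin schedule, while a larger η shifts scheduling toward the source with the highest estimated value and lets the other sources' ages grow, which mirrors the tradeoff described above.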
