Age-of-Information Bandits

We consider a system with a single source that tracks a time-varying quantity and periodically attempts to report its measurements to a monitoring station. Each update from the source has to be scheduled on one of K available communication channels. The probability that an attempted communication succeeds depends on the channel used, and these success probabilities are unknown to the scheduler. The metric of interest is the Age-of-Information (AoI), formally defined as the time elapsed since the destination received the most recent update from the source. We model our scheduling problem as a variant of the multi-armed bandit problem with communication channels as arms. We characterize a lower bound on the AoI regret achievable by any policy and analyze the performance of UCB, Thompson Sampling, and their variants. In addition, we propose novel policies which, unlike UCB and Thompson Sampling, use the current AoI to make scheduling decisions. Via simulations, we show that the proposed AoI-aware policies outperform existing AoI-agnostic policies.
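The setup above can be illustrated with a minimal simulation sketch. It is not the paper's code and makes several assumptions that the abstract does not state: time is slotted, each channel succeeds independently with a fixed Bernoulli probability, the AoI resets to 1 after a successful update and otherwise grows by 1, and AoI regret is taken as the cumulative AoI of a policy minus that of an oracle that always uses the best channel. The names `simulate_aoi`, `ucb1`, and `make_oracle` are hypothetical, and `ucb1` is the standard UCB1 index rule, an AoI-agnostic baseline rather than one of the proposed AoI-aware policies.

```python
import math
import random


def simulate_aoi(success_probs, horizon, policy, seed=0):
    """Return the cumulative AoI over `horizon` slots under `policy`.

    Assumed dynamics (not taken from the paper): the AoI resets to 1
    after a successful update and increases by 1 otherwise.
    """
    rng = random.Random(seed)
    k = len(success_probs)
    pulls = [0] * k   # number of attempts made on each channel
    wins = [0] * k    # number of successful updates on each channel
    age = 1           # assumed AoI at the start of the first slot
    total_age = 0
    for t in range(1, horizon + 1):
        total_age += age
        arm = policy(t, pulls, wins)
        success = rng.random() < success_probs[arm]
        pulls[arm] += 1
        wins[arm] += int(success)
        age = 1 if success else age + 1
    return total_age


def ucb1(t, pulls, wins):
    """Standard UCB1: try every channel once, then maximize the index."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(
        range(len(pulls)),
        key=lambda a: wins[a] / pulls[a]
        + math.sqrt(2 * math.log(t) / pulls[a]),
    )


def make_oracle(success_probs):
    """Policy that always uses the channel with the highest success rate."""
    best = max(range(len(success_probs)), key=lambda a: success_probs[a])
    return lambda t, pulls, wins: best


if __name__ == "__main__":
    probs = [0.3, 0.5, 0.8]   # hypothetical channel success probabilities
    horizon = 5000
    learner = simulate_aoi(probs, horizon, ucb1)
    oracle = simulate_aoi(probs, horizon, make_oracle(probs))
    print("empirical AoI regret of UCB1:", learner - oracle)
```

Because both runs share a seed and draw one uniform random number per slot, the oracle succeeds in every slot in which the learner succeeds, so the empirical regret in this sketch is never negative on a given sample path. An AoI-aware policy would additionally take the current `age` as an input to `policy`; the interface here would need that extra argument.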
